May 4, 2017
The Prediction Game: Data Science Firms vs. Pollsters on Sunday’s French Presidential Election
By Brent M. Eastwood and Mathew Burrows
Data Startups—First Time Lucky?
Predata, with its main office in New York City, mainly analyzes elections by what it calls “digital conversations.” These are measured by the candidates’ Twitter activity plus analysis of Wikipedia searches for the candidates. This enabled Predata to make an accurate call on Brexit. Its Brexit coverage showed the Remain camp losing the digital battle as early as May 2016, a month or more before the vote. The startup recently added an election metric based on YouTube engagement to measure the candidates’ short-term digital presence.
In a blog post for the French presidential election released on April 26, Predata noted Emmanuel Macron’s 20-point lead in the polls over Marine Le Pen and found that “little in Predata’s digital signals today offers a credible basis to question the polls’ prediction of the overall outcome.” French polling has been extremely accurate, but there were many reasons the final voter tally could be closer than 20 points.
Predata also determined that Macron was performing better on YouTube interactions in the first week after the opening round. Online conversations for Le Pen had improved on April 26, but were not enough to make a big difference in the polls, according to the firm. Predata also revealed that in geographical swing regions where both candidates run very close, Le Pen’s digital presence was not moving the needle after she finished second in initial voting. It is “not a good sign for the (sometime) [National Front] leader, who needs to win convincingly in the swing hinterlands if she’s to have any hope of breaking the tyranny of what she earlier this week called ‘the rotten old Republican front.’”
Predata concluded on April 26 that “the overall digital contest is fairly even,” but that voter apathy and people refusing to vote could influence the outcome. It noted that “abstention could hurt Macron more than Le Pen.”
In a research update on April 30, Predata again predicted that Le Pen would do better than the polls suggested.
“There’s nothing in our signals to indicate Le Pen will win; the situation remains as we described it late last week, with Macron dominating the digital conversation once YouTube is included, but Le Pen edging the contest if Wikipedia and Twitter…are the sole signal inputs.”
GovBrain, located in Washington, DC, predicted Donald Trump would win both the primary and general elections. It uses a patent-pending “Trend Meter” inside its machine learning and artificial intelligence system. The GovBrain system searches data from nearly one thousand government, regulatory, and legislative sources along with political, financial, and technology news sites from around the world. The GovBrain trend meter is self-contained, and unlike Predata and other entities, it does not incorporate analysis of social media, Google and Wikipedia searches, YouTube interactions, or polling data – its predictive signal is unique. To use the trend meter to begin tracking an election, one simply types the name of the candidate, presses Enter, and watches the results over time.
In its trend meter analysis conducted during the week of April 24, based on two readings per day, GovBrain also found that Macron maintained a steady lead that was similar to the initial polling data. According to the trend meter, Macron’s biggest lead was on Tuesday afternoon, April 25. Le Pen’s best day was on Wednesday, April 26, when she cut into Macron’s lead by about 3 points overnight. She also overperformed on Wednesday afternoon, but Macron came roaring back with a much better growth velocity (rate of 24-hour growth) all day, Thursday, April 27.
On Friday, April 28, Le Pen slightly cut into Macron’s lead with a better growth velocity. Le Pen’s improved Friday performance could have been the result of voter fears after news about another suspected Islamist shooting of two police officers in Réunion the day before. Le Pen continued her momentum over the weekend on April 29 and April 30. Trend meter readings found her growth velocity was higher than Macron’s last Saturday and Sunday. However, Macron righted the ship beginning on Monday, May 1, and continued to put up good numbers on Tuesday.
French Pollsters Taking Advantage of Lessons Learned
The success of French pollsters in the first round was due to better demographic modeling, particularly their understanding of the preferences of first-time voters, seniors, the less well-educated, and the rural poor in areas with poorer internet coverage. They had a better sense of abstention rates and constantly adjusted their turnout estimates based on evidence of rising enthusiasm. They also had the advantage of learning from the mistakes of pollsters who largely failed to predict Brexit or the Trump victory. They saw how badly pollsters underestimated Trump’s support in Rust Belt states.
Pollsters’ predictions on the first round were all within one percentage point or so of the actual result.
Predictions for the Last 48 Hours
GovBrain’s trend meter had Macron leading Le Pen before, during, and after the presidential debate. This was based on five hourly readings on Wednesday. Le Pen recovered from the sting of the debate and improved throughout the day on Thursday, but not enough to cut into Macron’s lead. Le Pen failed to build on her momentum from the weekend, and losing the debate did not help. She may have cut the lead down to 15 points on Monday morning; however, the lead is back up to 18 to 20 points post-debate, according to the trend meter on Thursday.
It is difficult to see a path to victory for Le Pen, according to GovBrain. The firm believes Macron’s handlers will urge him to refrain from any controversial comments. He will likely only appear in front of hand-picked supporters. They even have time to release an ad pointing to Macron’s superior debate performance. Plus, they already announced former US President Barack Obama’s endorsement of their candidate. That leaves Le Pen’s campaign staff to ponder the aftermath of a landslide loss. Their only choice is to energize the base and maximize turnout.
GovBrain offered this analysis on the election:
“This race is about geography and regionalism. Both candidates face a divided political map. During the first round, Le Pen did poorly in Paris and Paris remains her biggest problem. She also underperformed in Western, Southwestern, and South Central France. She must do better in cities in those regions such as Rennes, Nantes, Poitiers, Toulouse, and Bordeaux. Le Pen needs to hold final rallies and focus grass-roots efforts in urban neighborhoods with high unemployment. Lack of economic opportunity is the number one issue for many of her voters. Since she is operating so far behind, she will have to take some risks with her messaging and that could be a difficult sell to voters who are planning to abstain from voting. Le Pen already pulled a stunt last Wednesday by showing up at the same campaign event as Macron. Our data showed that this stunt may have enabled her best day on our trend meter. That’s been her only highlight during the second round.”
“Macron just needs to consolidate support where he is strongest – in the more cosmopolitan cities and suburbs – and lock down backing from the urban intelligentsia. He doesn’t have to expand his political map or change his message. He can continue to operate as the voice of reason in an uncertain world – a message that has served him well so far. Another spectacular terror attack in France could give citizens increased doubt about Macron’s experience level and hurt him in the upcoming legislative elections should he win the second round as expected.”
Predata’s post-debate analysis found that the live showdown between the two candidates is not likely to affect the final outcome. Predata’s signals “don’t show sufficient digital momentum for Le Pen that we think the polls are wrong in pointing to a Macron victory, but they do suggest the final outcome might be closer than the 60%-40% split that most polls continue to show.”
Predata also determined that Le Pen did enjoy “digital dominance” in some geographical swing regions and that low turnout will not help her make up much distance. However, Le Pen actually scored better than Macron based on Twitter activity and Wikipedia searches this week. When YouTube was added, Macron darted back into the lead, according to Predata.
Predata believes that Le Pen could close the gap in the coming days to avoid a 20-point loss. “The furious intensity with which Le Pen’s local representatives are continuing to broadcast her message in these parts of the country, and the comparatively lackadaisical pace of messaging from Macron’s camp, creates some basis to suspect the polls may be underestimating Le Pen’s true levels of second round support.”
It is difficult to find a poll that has not predicted a blowout win for Macron. On Thursday, a poll aggregation by The Telegraph-UK had Macron winning 61.2% to 38.9%. The Cevipof/Ipsos/Sopra Steria poll has Macron up 18 points. Harris Interactive has it 61 percent to 39 percent as well. OpinionWay said Le Pen trails by 20 points.
Brent M. Eastwood, PhD, is the Founder and CEO of GovBrain Inc., which predicts world events using machine learning, artificial intelligence, natural language processing, and data science. He is a former military officer and award-winning economic forecaster. Brent has founded and led companies in sectors such as biometrics and immersive video. He is also a Professorial Lecturer at The George Washington University's Elliott School of International Affairs.
Mathew Burrows is director of the Foresight, Strategy, and Risks Initiative in the Atlantic Council's Brent Scowcroft Center on International Security.