My thoughts on ECML PKDD 2018


With the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) being hosted at Croke Park in Dublin, it was a great opportunity for us here at DigitalBridge to keep a finger on the pulse of state-of-the-art machine learning. Therefore I packed my bags and made the short hop across the Irish Sea to the Emerald Isle. These are some of my thoughts on the talks I found most interesting.

 Croke Park during a Gaelic football match. Not this many people attended the conference!


The first day of the conference consisted of a set of workshops on specific applications of machine learning. Given the venue, it was only fitting that it hosted the Machine Learning and Data Mining for Sports Analytics (MLSA) workshop. With my background in Sports Engineering this was of particular interest to me, but I digress…

The Welcome Event keynote was presented by Corinna Cortes, the Head of Google Research in New York. She spoke about how Google are tackling the spread of disinformation, a topic that is high on the agenda for many large technology companies. She was keen to stress that Google simply serve search results and do not filter them in any way, but she did acknowledge that more could be done to inform people when the contents of a website were not credible. Google are therefore using machine learning to verify the information on a website and inform the user of its trustworthiness. My peers in the audience seemed somewhat sceptical about this, with interesting questions being raised about how the algorithm would handle satirical websites.

A useful aside from Corinna’s presentation was the introduction of Google Dataset Search. This search engine is similar to conventional Google search but returns only datasets. Here at DigitalBridge we’re often on the lookout for new datasets, and this tool should speed up the process.

 The splash page for Google Dataset Search


As noted by Ilaria in her blog post on KDD2018, Deep Learning on Graphs is currently a hot topic in the research community. This was not reflected in the papers at ECML, but I did attend a tutorial on the topic. The tutorial included a brief history of the research field and some curious applications. As we are currently in the process of implementing Graph Neural Networks within the business, it was helpful to discuss this field with academics who have been working in the area for 20 years. It is also exciting to know that at DigitalBridge we are working on cutting-edge technology.

While Deep Learning has revolutionised machine learning, and as such dominated the proceedings, one downside is that the models are non-interpretable black boxes. This is a particular concern in high-risk applications, such as medicine or nuclear power stations, where it is important to understand why a decision was made. Consequently, there is a machine learning community fighting back. Cynthia Rudin presented her lab’s work, which argues that we should be directly designing interpretable models rather than trying to interpret black box models. (Coincidentally, and without realising who she was, I ended up sitting next to Cynthia on the bus to the Welcome Event and got a sneak peek at what she would be presenting.) She proposed that for most real-world datasets an interpretable model exists that achieves similar performance to any non-interpretable model. As such, performance should not be a factor when deciding on the interpretability of the model. E-commerce is not a particularly high-risk application, but being able to explain decisions to our customers is appealing. We will therefore be keeping an eye on the latest developments in the world of interpretable models and, where applicable, applying the methods within our new products.

Overall, ECML PKDD was a great way for the company to stay up to date with the state of the art in the machine learning literature. It provided a chance to network with other researchers and inspired a collection of ideas that we can develop into new products. With next year’s event in Würzburg, Germany, I expect less Irish dancing and pilsner instead of Guinness, but hopefully we can present some groundbreaking research of our own.

As of September 2018, we are recruiting for a Machine Learning researcher. If you like the sound of working on cutting-edge machine learning applications and having the opportunity to attend international conferences, we would love to hear from you. Please see here for more information.

David Higham