Is Data Provenance and Ethics one of Your Corporate New Year’s Resolutions? If Not, It Should Be!
By Simon Hay, CEO
2018 and the introduction of GDPR seem to be a fading memory for many, based on what I see as a customer and regular shopper.
However, I don’t think we have yet seen the full power of GDPR enforced and ePrivacy is getting closer – data provenance and ethics must remain high on the business agenda. Research from Gartner reveals that risks surrounding data and analytics are the primary concerns of senior executives for 2019.
The same study found that the pursuit of digital business models to drive growth has increased the amount of data collected and processed by businesses at a time when public and regulatory scrutiny is very high. Despite this, only 37 per cent of organisations have a formal data auditing framework in place.
Yet the continued occurrence of high-profile data breaches and increased public scrutiny have led to greater organisational data accountability, which means that regular data auditing must become a high priority for organisations in 2019.
The DMA suggests that a good data audit should answer the following key questions:
- What data do you hold and why?
- How do you collect the data?
- How and where is the data stored?
- What do you do with the data?
- Who owns and controls the personal data?
- How long is the data held for and when is it deleted?
- Who is responsible for the data and processors associated with data?
- Do you have adequate technology and processes to manage data processing?
With the increasing reliance on business analytics, it is recommended that these eight questions are considered often, at least quarterly.
This is important because, under GDPR, organisations must know the provenance of their customer data. This means they must be able to identify where a specific piece of customer information was sourced or be able to explain to a customer exactly why they received a specific communication. As data science becomes cheaper and more easily applied to decision processes, this requirement becomes more difficult to meet.
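To make the provenance requirement concrete, here is a minimal sketch of what a provenance record and lookup could look like. It is an illustration only: the `ProvenanceRecord` class, its field names and the sample values are all hypothetical, and a real system would sit on top of whatever customer database and consent records you already hold.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProvenanceRecord:
    """Where one piece of customer data came from (all fields are hypothetical)."""
    customer_id: str
    field_name: str      # e.g. "email"
    source: str          # e.g. "newsletter sign-up form"
    collected_on: date
    lawful_basis: str    # e.g. "consent"


def find_source(records, customer_id, field_name):
    """Return the provenance records for one field of one customer."""
    return [
        r for r in records
        if r.customer_id == customer_id and r.field_name == field_name
    ]


def explain(records, customer_id, field_name):
    """Build a plain-language answer to 'where did you get this?'."""
    matches = find_source(records, customer_id, field_name)
    if not matches:
        # A missing record is exactly what a data audit should surface.
        return f"No recorded source for {field_name}; this is an audit gap."
    r = matches[0]
    return (
        f"{r.field_name} was collected on {r.collected_on.isoformat()} "
        f"via {r.source} (lawful basis: {r.lawful_basis})."
    )


# Invented sample data for illustration.
records = [
    ProvenanceRecord("c-001", "email", "newsletter sign-up form",
                     date(2018, 6, 1), "consent"),
]
print(explain(records, "c-001", "email"))
print(explain(records, "c-001", "postcode"))
```

The point of the sketch is that the answer to a customer's question is only as good as the record behind it: if no source was captured when the data was collected, there is nothing to look up later.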
If the data being used is actual permissioned, opt-in data, it is simple. It is more complex if the data has been modelled. How decisions are arrived at is normally pretty clear for regression or decision tree analysis, which are linear or rule-based models. However, predictive models built using neural networks are much more opaque, and why the algorithm arrived at a given conclusion is much harder to explain.
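As a rough illustration of why rule-based models are easy to explain, the sketch below hand-writes a tiny decision tree that returns its reasons alongside its decision. The rules, thresholds and field names are invented for the example and are not taken from any real model.

```python
def decide_with_reasons(customer):
    """Tiny hand-written decision tree; rules and thresholds are invented.

    Returns the decision together with the list of rules that fired,
    so the outcome can be explained to the customer in plain language.
    """
    reasons = []
    if customer["opted_in"]:
        reasons.append("customer opted in to marketing")
        if customer["purchases_last_year"] >= 3:
            reasons.append("3 or more purchases in the last year")
            return "send loyalty offer", reasons
        reasons.append("fewer than 3 purchases in the last year")
        return "send newsletter", reasons
    reasons.append("customer has not opted in to marketing")
    return "do not contact", reasons


decision, reasons = decide_with_reasons(
    {"opted_in": True, "purchases_last_year": 5}
)
print(decision)
for reason in reasons:
    print(" -", reason)
```

Every decision here can be traced back to the exact rules that produced it. A neural network has no equivalent list of rules to read off, which is why explaining its output takes extra work.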
I have tried to ask my bank how an automated decision process was applied to my communications. Six months later, and with lots of correspondence, I am still waiting for a clear answer; I suspect no one really knows. However, it is this type of data science that is increasingly being used in pursuit of closer customer relationships.
This is why at Outra we bring clarity to deep learning through our proprietary approach to training data.
By drip feeding our predictive models, we can pinpoint the features that make the difference, meaning that all of our AI-enhanced predictions are GDPR compliant and our clients can answer any difficult question that comes their way. Explaining how models work is a key part of reputation management – “the computer says no” doesn’t cut it under GDPR.
2019 is set to be an exciting year for business and marketing as fast-emerging developments in data science enable even greater relevance for customers. However, for these to become a reality, getting your act together on data provenance and data science ethics has to be part of your plan.