We are delighted to be attending FIMA 2018 as joint sponsors and speakers with our partner SAS. We are very much looking forward to demonstrating how we work together to help organisations power their data strategy and analytical capabilities.
Simon Trewin, a Director and Co-founder of Kinaesis, will be presenting a talk on ‘The Difference is DataOps’ at 2.20pm in the ‘Quality and Remediation’ (B) stream. Don’t miss it!
We are looking forward to seeing our fellow members of the financial services community at the event!
We are delighted to announce that we are welcoming the award-winning data lineage solution, Solidatus, as our newest partner. Solidatus is a modern, specialised and powerful data lineage tool and we are looking forward to utilising their solution in several upcoming projects and propositions.
This marks what we hope to be the beginning of a great partnership as we look to further our work on intelligent data integration and instrumentation in the next year.
Simon Trewin, Director of Kinaesis, welcomed the new partnership by saying, “We are very pleased to have Solidatus join us as a partner. Their data lineage solution combined with Kinaesis services will help our clients better understand where and how data is being used in their organisations with a focus on improving quality and usage.”
Howard Travers, Chief Commercial Officer for Solidatus, added, “We’re delighted to be joining the Kinaesis partnership and look forward to working with them on some exciting projects. Solidatus was developed due to the genuine need for a sophisticated, scalable data lineage tool to help companies meet their regulatory & transformational change goals. We’re thrilled to be working with Kinaesis and believe this partnership adds another important piece of the puzzle to our overall proposition.”
We would like to take this opportunity to officially welcome them as our Partner this month.
Any further queries should be directed to email@example.com
Not long ago, 'metadata' was a fairly rare word, representing something exotic and a bit geeky that generally wasn't considered essential to business.
Times have changed. Regulation has forced business to build up metadata. Vendors are emphasising the metadata management capabilities of their systems. The word 'metadata' almost sums up the post-BCBS 239 era of data management - the era in which enterprises are expected to be able to show their working, rather than just present numbers.
Customers frequently ask for better metadata, and more of it - looking to reduce costs and risk, please auditors and satisfy regulators.
The trouble with labels, though, is that they tend to hide the truth. 'Metadata' itself is a label and the more we discuss 'metadata' and how we'd like to have more of it, the more we start to wonder if 'metadata' actually means the same thing to everyone. In this article, I'd like to propose a strawman breakdown of what metadata actually consists of. That way, we'll have a concise, domain-appropriate definition to share when we refer to "global metadata" - good practice, to say the least!
So, when we gather and manage metadata, what do we gather and manage?
Terms: what data means
To become ‘information’ rather than just ‘data’, a number must be associated with some business meaning. Unfortunately, experience shows that simple words like 'arrears' or 'loan amount' do not, in fact, have a generally agreed business meaning, even within one enterprise. This is why we have glossary systems: to keep track of business terms and to relate them to physical data. Managing terms and showing how physical data relates to business terms is an important aspect of metadata. Much has been invested and achieved in this area over the last few years. Nevertheless, compiling glossaries that really represent the business and that can practically be applied to physical data remains a complex and challenging affair.
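As a rough illustration of relating business terms to physical data, here is a minimal Python sketch. The glossary entries, business areas, definitions and column names are all invented for the example; a real glossary system would hold far more context than this.

```python
# Hypothetical glossary: one business term can carry a different
# definition in each business area that uses it.
GLOSSARY = {
    "arrears": {
        "retail_lending": "Payments overdue by one or more days",
        "collections": "Payments overdue by 30 or more days",
    },
    "loan amount": {
        "retail_lending": "Original principal at drawdown",
    },
}

# Hypothetical mapping from physical columns to (term, business area).
COLUMN_TERMS = {
    "loans.arrears_flag": ("arrears", "retail_lending"),
    "coll.arrears_ind": ("arrears", "collections"),
    "loans.amt": ("loan amount", "retail_lending"),
}


def meaning_of(column):
    """Return the business term and definition attached to a physical column."""
    term, area = COLUMN_TERMS[column]
    return term, GLOSSARY[term][area]


def ambiguous_terms(glossary):
    """List terms that have more than one distinct definition."""
    return sorted(
        term for term, defs in glossary.items() if len(set(defs.values())) > 1
    )


print(meaning_of("coll.arrears_ind"))
print(ambiguous_terms(GLOSSARY))
```

Even in a toy like this, 'arrears' shows up as a term that two areas define differently, which is exactly the situation a glossary exists to expose.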
Lineage: where data comes from
Lineage (not to be confused with provenance) is a description of how data is transformed, enriched and changed as it flows through the pipeline. It generally takes the form of a dependency graph. When I say 'the risk numbers submitted to the Fed flow through the following systems,' that's lineage. If it's fine-grained and correct, lineage is an incredibly valuable kind of metadata; it's also required, explicitly or implicitly, by many regulations.
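To make the 'dependency graph' idea concrete, here is a minimal Python sketch. The system names are invented for illustration, and the graph is a plain dictionary rather than the output of any particular lineage tool.

```python
from collections import deque

# Hypothetical lineage graph: each dataset maps to the datasets or
# systems it is directly derived from.
LINEAGE = {
    "fed_risk_submission": ["risk_aggregation"],
    "risk_aggregation": ["risk_engine", "reference_data"],
    "risk_engine": ["trade_store", "market_data"],
    "reference_data": [],
    "trade_store": [],
    "market_data": [],
}


def upstream(dataset, graph):
    """Return everything `dataset` depends on, directly or indirectly."""
    seen = set()
    queue = deque(graph.get(dataset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


print(sorted(upstream("fed_risk_submission", LINEAGE)))
```

Walking the graph upstream answers the question 'which systems do these numbers flow through?' without reference to any particular run of the pipeline.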
Provenance: what data is made of
Provenance (not to be confused with lineage) is a description of where a particular set of data exiting the pipeline has come from: the filenames, software versions, manual adjustments and quality processes that are relevant to that particular physical batch of data. When I say 'the risk numbers submitted to the Fed in Q2 came from the following risk batches and reference data files,' that's provenance. Provenance is flat-out essential in many highly regulated areas, including stress testing, credit scoring models and many others.
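By contrast with lineage, a provenance record is tied to one physical batch. A minimal sketch, assuming a simple in-memory record; the field names, file names and version numbers are all hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    """What one physical batch of output data was made of (illustrative fields)."""
    batch_id: str
    input_files: list
    software_versions: dict
    manual_adjustments: list = field(default_factory=list)
    quality_checks: list = field(default_factory=list)

    def was_adjusted(self):
        """True if anyone manually changed this batch before delivery."""
        return bool(self.manual_adjustments)


# Hypothetical record for a single quarterly submission.
q2_submission = ProvenanceRecord(
    batch_id="fed_submission_Q2",
    input_files=["risk_batch_A.csv", "risk_batch_B.csv", "reference_data_Q2.csv"],
    software_versions={"risk_engine": "4.2.1"},
    manual_adjustments=["desk override on one portfolio"],
    quality_checks=["completeness", "reconciliation"],
)

print(q2_submission.input_files)
print(q2_submission.was_adjusted())
```

The lineage graph stays the same from quarter to quarter; a record like this changes with every batch.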
Quality metrics: what data is like
Everyone has a data quality process. Not everyone can take the outputs and apply them to actual data delivery so that quality measures and profiling information are delivered alongside the data itself. It’s great that clued-in businesses are starting to ask for this kind of metadata frequently. The other good news is that advances in DataOps approaches and in tooling are making it easier and easier to deliver.
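One simple way to picture delivering quality metrics alongside the data is to bundle a small profile with each batch. The sketch below is only an illustration under that assumption; the column names, sample rows and choice of metrics are invented.

```python
def profile(rows, column):
    """Compute simple quality metrics for one column of a batch."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "missing": len(values) - len(present),
        "completeness": len(present) / len(values) if values else None,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def deliver(rows, columns):
    """Bundle the data with its quality metrics so consumers receive both."""
    return {"data": rows, "quality": {c: profile(rows, c) for c in columns}}


# Hypothetical batch with one missing loan amount.
batch = [
    {"loan_id": 1, "loan_amount": 250000.0},
    {"loan_id": 2, "loan_amount": None},
    {"loan_id": 3, "loan_amount": 90000.0},
    {"loan_id": 4, "loan_amount": 120000.0},
]

package = deliver(batch, ["loan_amount"])
print(package["quality"]["loan_amount"])
```

The consumer of `package` can see at a glance that a quarter of the loan amounts are missing, without having to ask the data quality team.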
Usage metadata: how data may be used
'Usage metadata' is not a very commonly used term. Yet it's a very important type of metadata, in terms of the money and risk that could be saved by applying it pervasively and getting it right. Usage metadata describes how data should be used. One example is the identification of golden sources and golden redistributors; that metadata tells us which data should be re-used as a mart and which data should not be depended upon. But another extremely important type of metadata to maintain is sizing and capacity information, without which new use cases may require painful trial and error before reaching production.
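As a sketch of how usage metadata might be applied, the example below checks a proposed new use case against golden-source flags and documented capacity. The dataset names, the `max_daily_readers` capacity measure and the figures are assumptions made purely for illustration.

```python
# Hypothetical usage metadata: whether a dataset is an approved golden
# source, and how much read capacity has been documented for it.
USAGE = {
    "customer_master": {
        "golden_source": True, "max_daily_readers": 50, "current_daily_readers": 48,
    },
    "risk_mart": {
        "golden_source": True, "max_daily_readers": 20, "current_daily_readers": 5,
    },
    "desk_extract": {
        "golden_source": False, "max_daily_readers": 5, "current_daily_readers": 1,
    },
}


def review_new_use_case(dataset, extra_readers, usage):
    """Return the reasons, if any, why a new use case should not rely on `dataset`."""
    entry = usage[dataset]
    problems = []
    if not entry["golden_source"]:
        problems.append("not a golden source")
    if entry["current_daily_readers"] + extra_readers > entry["max_daily_readers"]:
        problems.append("exceeds documented capacity")
    return problems


print(review_new_use_case("risk_mart", 3, USAGE))
print(review_new_use_case("customer_master", 5, USAGE))
print(review_new_use_case("desk_extract", 1, USAGE))
```

A check like this can run at design time, which is where the saving comes from: the capacity problem is found before the trial and error starts.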
There are other kinds of metadata as well; one organisation might have complex ontology information that goes beyond what's normally meant by 'terms' and another may describe file permissions and timestamps as 'metadata'. In the list above, I've tried to outline the types of metadata that should be considered as part of any discussion of how to improve an enterprise data estate... and I've also tried to sneak in a quick explanation of how 'lineage' is different from 'provenance'. Of all life's pleasures, well-defined terms are perhaps the greatest.