Disclaimer – The views, thoughts, and opinions expressed in this article are my own, and should not be in any way attributed to my employer or any other organisation.
In today’s age of digitalisation and big data, modern day technology has immense potential to completely reshape the way we do things in interesting and unexpected ways. For audit professionals, such as myself, this could mean transforming the way we slice, dice and risk assess our audit universe, and how and when we conduct our audits. And for the braver souls amongst us, perhaps this could also mean redefining the role of an audit department, from an assurance provider to a function that delivers forward-looking insights and strategic advice on risk management.
The graph database is one such emerging technology, with the potential to enable the development of a "digital brain" at scale within audit departments: a brain that can connect the dots between business activities across multiple departments and allow us to see the whole story as and when it unfolds. This article is an attempt to provide a gentle introduction to this wonderful emerging world of graph technology, as well as offer some inspiration as to why audit departments should aspire to build a know-all, see-all type of "digital brain" for everyday use.
In business – and in life – our relationships matter a whole lot more than our individual skills or competencies. The same could be said about data. Even as the volume of discrete data points increases (and it will continue to do so), the real value comes not from these data points, but from the connections between the data that's collected. Take, for example, a hypothetical online retailer with two million Facebook "likes" and 2,000 committed shoppers, and imagine you are the owner of this company. What can you tell from the number of Facebook "likes" and loyal customers? Other than the fact that you run a smallish online shop, perhaps very little! But what if someone were to tell you that, using graph technology, you could easily join the dots between who likes your company's Facebook page and who's most likely to shop next, and draw the relationships between your Facebook promotions and the items purchased by your most loyal shoppers (without compromising their privacy)? Which set of information would you prefer – the nice-to-know statistics that you can glean from the individual datasets in isolation (Facebook likes and customer base), or the treasure trove of invaluable insights that you can mine from the relationships between these datasets?
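To make the retailer example concrete, here is a toy sketch of those two datasets joined as a graph, using the open-source networkx library as a lightweight stand-in for a full graph database. Every name in it (the fans, the page, the items) is made up purely for illustration:

```python
import networkx as nx

# Two otherwise separate datasets, bound together in one graph:
# people who "like" the page, and people who purchased items.
g = nx.DiGraph()
for fan in ("alice", "bob", "carol"):
    g.add_edge(fan, "Our Facebook Page", relationship="likes")
g.add_edge("alice", "running shoes", relationship="purchased")
g.add_edge("bob", "promo hoodie", relationship="purchased")

# The insight lives in the intersection of the two relationships,
# not in either dataset on its own: fans who are also customers.
fans = {u for u, v in g.in_edges("Our Facebook Page")}
buyers = {u for u, v, d in g.edges(data=True)
          if d["relationship"] == "purchased"}
overlap = sorted(fans & buyers)
print(overlap)  # ['alice', 'bob']
```

Trivial at this size, of course – but the same one-line traversal works unchanged whether the graph holds five people or two million.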
Back to the exciting world of auditing: what does this mean, and what can we do with this graph technology? The answer is pretty simple – once you start appreciating the power of this technology and learn how to use it (there are a number of open-source as well as commercial graph databases to choose from), the possibilities are endless, from automating a whole bunch of data-driven audits to deploying continuous auditing strategies that, only a short while ago, many would have dismissed as mere fantasy.
Not convinced? Have a look at the figure below – which, if I may add, I have hastily put together for this article (so please do not take it as gospel). It is a "graph" view of a collection of business-critical datasets in a typical Wholesale Bank, and the relationships between these datasets. The green rectangles (referred to as "nodes" or "vertices" in graph terminology) represent the standalone datasets, and the directional blue lines (referred to as "edges" in graph terminology) signify the relationships between two neighbouring datasets.
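As a minimal sketch of how a figure like this becomes a queryable object, the snippet below rebuilds a small slice of it in the open-source networkx library (a stand-in for a real graph database). The specific edges and relationship labels are my own illustrative guesses at what the figure depicts:

```python
import networkx as nx

g = nx.DiGraph()

# Each green rectangle (dataset) becomes a node; each blue line
# becomes a directed edge labelled with the relationship it carries.
g.add_edge("Sales and Traders", "Trades", relationship="books")
g.add_edge("Trades", "Counterparties", relationship="faces")
g.add_edge("Trades", "Market Risk", relationship="feeds")
g.add_edge("Market Risk", "Regulatory Capital", relationship="drives")

print(g.number_of_nodes(), g.number_of_edges())  # 5 4
```

Once the datasets are bound together like this, the relationships themselves become data that you can interrogate, rather than something implied across a dozen disconnected spreadsheets.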
Figure A – A graph view of the relationships between datasets in a typical Wholesale Bank. Each dataset can have one or more data attributes (for example, the “Sales and Traders” dataset could be made up of employee details, trading mandates and limits, role etc.)
No matter which industry you work in, you will find similar relationships in the datasets within your organisation. For a variety of reasons, however, which I will not dwell upon in this article, present and past data analytics efforts within the audit industry have largely focussed on analysing datasets in silos and, to an extent, on exploring the simple relationships in data that can be interrogated using an Excel "VLOOKUP" or a SQL "JOIN". But this is not good enough, and a more comprehensive analysis of the relationships in data is vital for audit professionals. This is because when things go wrong (I am referring to major operational risk events, such as rogue trading or accounting fraud, which can impact a company's bottom line), more often than not it is due to control failures in multiple business areas. Moreover, red flags in a standalone dataset may not tell you much – it could be something or nothing. But if you query the relationships in your datasets and discover similar red flags in other connected datasets, you would know immediately that something is not right.
This is where a graph database comes into play. A fully featured graph database will allow you to upload (or stream in real-time) a discrete collection of datasets and bind them together (by specifying the rules that connect two neighbouring datasets) into a connected graph. The beauty of having such a repository of connected datasets (aka “digital brain”) is that you can not only query the relationships between two neighbouring datasets, but also the relationships between datasets which are set farther apart by other intermediary but connected datasets (e.g. in “Figure A” above, the “Sales and Traders” and the “Regulatory Capital” datasets are connected to each other via the “Trades” and “Market Risk” datasets).
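To sketch what such an indirect query looks like, the snippet below uses the open-source networkx library (standing in for a graph database, which would express this as a path query) to walk from "Sales and Traders" to "Regulatory Capital" through the intermediary datasets named in Figure A:

```python
import networkx as nx

# Rebuild the relevant slice of Figure A (edges are illustrative).
g = nx.DiGraph()
g.add_edge("Sales and Traders", "Trades")
g.add_edge("Trades", "Market Risk")
g.add_edge("Market Risk", "Regulatory Capital")

# Datasets that are not directly linked can still be related:
# the graph finds the chain of intermediaries for us.
path = nx.shortest_path(g, "Sales and Traders", "Regulatory Capital")
print(" -> ".join(path))
# Sales and Traders -> Trades -> Market Risk -> Regulatory Capital
```

In a production graph database this traversal happens in one declarative query, however many intermediary datasets sit between the two endpoints.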
To illustrate this further, let me use the graph view of a collection of datasets in a typical Wholesale Bank ("Figure A" above) and list a sample of very pertinent questions that you can easily answer using a graph database:

(i) Which traders regularly cancel or amend their trades, who are the counterparties for these cancelled or amended trades, were these trades ever confirmed with the counterparties, which desks do these traders work for, and what does their P&L look like?

(ii) Which counterparties are failing to settle their trades on time, what is our credit exposure to these counterparties, and which salespeople or traders are doing business with them?

(iii) Is a trader's year-end remuneration commensurate with the risk (and risk horizon) they are taking, the P&L they are generating, and the cost of the regulatory capital needed to support their business activities?

(iv) Which desks are growing too fast (from the perspective of risk-taking and P&L generation)?

(v) Are the market risk numbers in line with the P&L?

(vi) Which traders' P&L spikes or falls at certain times during the year, such as quarter-end or year-end (an indication that they could be manipulating the P&L to meet their targets)?

(vii) Are we correctly disclosing our trading activities to the regulators?

I could go on and on, but I guess you get the point.
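To show that questions like these really are just graph traversals, here is a minimal sketch of question (i) – traders who repeatedly cancel trades, and the counterparties those trades face – again using the open-source networkx library as a stand-in for a graph database. The traders, trades and counterparties are entirely fictitious:

```python
import networkx as nx

g = nx.MultiDiGraph()

# Trader -> Trade edges carry the trade's status;
# Trade -> Counterparty edges record who the trade faces.
g.add_edge("Trader A", "Trade 1", status="cancelled")
g.add_edge("Trader A", "Trade 2", status="cancelled")
g.add_edge("Trader B", "Trade 3", status="confirmed")
g.add_edge("Trade 1", "CP X")
g.add_edge("Trade 2", "CP Y")
g.add_edge("Trade 3", "CP X")

# Flag traders with more than one cancelled trade, then hop one
# step further along the graph to see which counterparties are involved.
flagged = {}
for trader in ("Trader A", "Trader B"):
    cancelled = [t for _, t, d in g.out_edges(trader, data=True)
                 if d.get("status") == "cancelled"]
    if len(cancelled) > 1:
        flagged[trader] = sorted({cp for t in cancelled
                                  for _, cp in g.out_edges(t)})
print(flagged)  # {'Trader A': ['CP X', 'CP Y']}
```

Swap the toy data for the bank's real trade feeds and the same two-hop walk answers the question across the whole book – the remaining questions in the list are variations on longer walks through the same connected graph.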
Fascinating as this idea of a "digital brain" might sound to many of us, I am pretty sure there will be auditors amongst us who would argue that such an enterprise is best left to the business, and that auditors should focus on what they do best – i.e. the good old-fashioned way of assessing risks and controls. When it comes to analysing standalone datasets, their argument has some merit: a standalone dataset is almost always owned by a specific business line, and the business should therefore be analysing that dataset as part of its day-to-day control activities. However, when it comes to exploring cross-functional relationships in datasets, I will have to respectfully disagree that we should leave it for someone else to take care of. This is because, unlike audit, very few functions have the mandate and the authority to look under the hood of business activities in each and every part of an organisation.
On that high note, let me wrap up this article by wishing my readers a very happy, prosperous and innovative 2019!