Well, you've made it this far — welcome to the last lecture of Stanford GSBGEN 544, How Software Ate Finance. The topic of this last lecture is data, asset management, and conclusions. So let's begin with data. It's a cliché by now that data is fuel, and it's certainly the fuel for the financial industry. Data is abstract, of course, so to make it more tangible, some facts and images. A hundred years ago we had ticker tape. The tape would come out of machines, and it would show a stock ticker, or stock symbol, and then the buys and sells and how many lots — a lot was 100 shares — and of course the price. And the ticker tape made excellent confetti when it was time to throw a ticker-tape parade. Getting into the modern era: IEX, which we've discussed, introduced a 38-mile coil of fiber-optic cable to reduce the effects of latency arbitrage in the regime of price-time priority on U.S. stock exchanges. If you wonder what a 38-mile coil of fiber-optic cable looks like, there it is. And then, well, if you want to reduce transmission time, you definitely want a straight line, and a great way to get a completely straight line is to set up two microwave towers. This is a pre-construction microwave tower in Aurora, Illinois, near the CME data center; the funds that built it wanted direct and early access. Also note that Aurora, Illinois is home not only to the CME data center but also to Wayne and Garth from Wayne's World. Data remains a source of edge. People want more data, at finer time intervals, and there's really just no stopping it. You not only want to know the prices at which things traded; you also want to know the top of the order book at any point in time, and maybe the entire order book at every point in time — constantly scaling up the requirements for faster ingestion and processing of data.
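The physics behind the coil and the microwave towers is just speed-of-light arithmetic, which we can sketch. The refractive index of fiber (about 1.47, so light in glass travels at roughly two-thirds of its vacuum speed) and the distances are illustrative assumptions, not the exchanges' published figures.

```python
# Back-of-the-envelope latency arithmetic for a fiber coil and a microwave link.
# The fiber refractive index (~1.47) and distances are assumptions for illustration.
C = 299_792_458          # speed of light in vacuum, m/s
FIBER_INDEX = 1.47       # light in optical fiber travels at roughly c / 1.47
MILE = 1609.34           # meters per mile

def fiber_delay_us(miles: float) -> float:
    """One-way delay, in microseconds, through `miles` of optical fiber."""
    return miles * MILE / (C / FIBER_INDEX) * 1e6

def microwave_delay_us(miles: float) -> float:
    """One-way delay of a line-of-sight microwave path (propagates at ~c)."""
    return miles * MILE / C * 1e6

coil_us = fiber_delay_us(38)             # a 38-mile coil: roughly 300 microseconds
chicago_nj_us = microwave_delay_us(720)  # ~720 straight-line miles: under 4 ms
```

The point of the microwave route is visible in the ratio: over the same distance, a straight microwave path is about 47% faster than fiber, which is an eternity at trading timescales.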
One difference, perhaps, is that finance has required access to large data sets for longer than most other industries. And it's really no accident that I entered finance in '93: Wall Street had the money, and it had the requirements. Significantly, compute power and data stores were just barely good enough to meet those requirements. It's perhaps not random that I majored in biochemical sciences, and also did computer science and medical information sciences, and after that ended up veering off into Wall Street. Goldman Sachs told its headhunters in '93: identify entrepreneurs in Silicon Valley with PhDs from Stanford in computer science — and I was just on that list. It was actually hard to create that Venn diagram in the days before LinkedIn, but the headhunter did it, found me, and that's how I ended up there. Looking back, I can see something interesting, which is what Goldman needed to do in the 90s: build data and risk analytics, initially for the foreign exchange trading business. Well, think about it: a foreign exchange spot or forward trade is actually a very simple widget. To fully specify the trade, you need the two currencies, the two trading counterparties, the exchange rate, and the delivery date, and that's the full specification of the widget. There will be some legal and contractual documents that go around it, but those were standard even back then. And yet even with very simple widgets — once you put a bunch of them into a portfolio, and you're buying and selling them all day, and you're doing that in multiple trading centers simultaneously, and you need to calculate all kinds of risk, and you don't know what the future exchange rate is going to be, so there's uncertainty, there are probability distributions — you have an amazingly complicated problem. We could just barely solve it back then.
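The "very simple widget" just described — two currencies, two counterparties, an exchange rate, a delivery date — can be written down directly. This is my own sketch; the field names are invented for illustration and have nothing to do with how SecDB actually modeled trades.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class FXForward:
    """The 'very simple widget': two currencies, two trading counterparties,
    an exchange rate, and a delivery date. Field names are hypothetical."""
    base_ccy: str    # e.g. "USD"
    quote_ccy: str   # e.g. "JPY"
    buyer: str       # counterparty buying the base currency
    seller: str      # counterparty selling the base currency
    rate: float      # agreed exchange rate, quote units per base unit
    delivery: date   # delivery (settlement) date

# One fully specified trade.
trade = FXForward("USD", "JPY", "GS", "ClientCo", 110.25, date(1994, 3, 15))
```

The complexity the lecture describes comes not from this record but from holding thousands of them, across trading centers, under uncertain future rates.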
Something that prompted my immediate hiring was that the initial version of SecDB — which was an in-memory data store; you can think of it as NoSQL — was about to blow out of its 16-bit address space. We urgently needed a port to SPARCstations, which were 32-bit, and that was my very first project. That was really an early read on how data was growing exponentially. It was terrifying, with that exponential growth, how quickly we were going to hit the absolute limit and blow up, and so we needed those extra 16 bits of address space. If you think about the model requirements in the life sciences: what was hard enough to do at Goldman, where the widgets could be specified by a handful of separate parameters — in the life sciences, aspirationally, you want to fully model a cell, or a tissue, or an organism, or a population of human organisms interacting with non-human organisms, and we were many iterations of Moore's Law away from being able to do that. Compute power was such that in the 90s you could just about dream of tackling problems in finance; you couldn't really do much more than solve toy problems in the life sciences. And so I went to Wall Street. Let's talk a little bit about the landscape of financial data providers. Almost all of these figures are market capitalization in billions of US dollars. You can see the two that dominate the landscape are S&P and Bloomberg, at roughly 60 to 70 billion dollars of market cap. Then there are three players at about half that market cap: Thomson Reuters, now Refinitiv; MSCI, which started off as Morgan Stanley Capital International; and, at 26 billion dollars, Markit, which started off as a Goldman Sachs PSI investment in a consortium with a number of other sell-side firms. I also list the exchanges. Why? Data revenues for the exchanges are growing fast — over the last 10 years they grew at a CAGR of almost 12%.
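To make that CAGR figure concrete: 12% a year, compounded over ten years, roughly triples the starting revenue. A couple of lines verify the compounding.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start value and an end value."""
    return (end / start) ** (1 / years) - 1

# 12% a year, compounded for 10 years, roughly triples the starting figure.
growth_factor = 1.12 ** 10                              # about 3.1x
implied_rate = cagr(100.0, 100.0 * growth_factor, 10)   # recovers 0.12
```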
Now, for ICE, Nasdaq, and Cboe, those revenues are nearly 5 billion dollars annually — more than half the total revenues of the exchanges — so obviously data is hugely important. If you look at the market cap of those exchanges added up, that's about 140 billion dollars, and easily over half — probably well over half — of that market cap is purely attributable to data. That makes the exchanges collectively at least as important, and as valuable, as S&P and Bloomberg. And then in the middle I put Goldman Sachs, SecDB, and TSDB; I'll say a little bit more about those entries into the data business. So let's start with Bloomberg. The legend goes back to 1981: Phibro bought Salomon Brothers, and Mike Bloomberg, a Salomon partner at the time, received a $10 million settlement. He used the settlement money to start a company called Innovative Market Systems, later renamed Bloomberg LP, the name it still has today. Late in that era, Merrill Lynch bought 20 of the new terminals and a 30% equity stake for $30 million. In return, there was a five-year restriction on Bloomberg selling terminals to others; that restriction was later relaxed. In the depths of the financial crisis, Merrill sold its remaining stake, which was 20%, back to an entity owned by Mike Bloomberg for a little over $4 billion, valuing Bloomberg at twenty-two and a half billion dollars. There's a Bloomberg terminal circa 1982. Now the terminal is really a window on the screen, but the look and feel are the same — something I think is brilliant, and maybe a bit quirky: Bloomberg still has that classic visual look and feel, except it's now implemented in HTML5. Rolling up to 2019, the annual revenue of Bloomberg, estimating, is about 10 billion dollars. You can scale that against some of the other businesses we've talked about and are going to talk about, and putting a multiple on it makes it worth somewhere in the neighborhood of Goldman Sachs's market cap. Of course it's not public, so this is all just a back-of-the-envelope estimate.
A fascinating thing about Bloomberg is that it has successfully defended its price point, which is about $28,000 per person per year, over decades. Very few firms have managed to do that. If you talk to Bloomberg executives, they'll say it's because they're constantly, furiously adding more functionality, more data, more analytics, more capabilities to that terminal. Some would say the main value of Bloomberg is the chat facility — it really is the place where the buy side communicates with the sell side. Many in the industry would question the analytics of Bloomberg, saying Bloomberg doesn't have skin in the game, it doesn't actually have capital at risk. But the analytics are quite good, and in some markets the Bloomberg implied volatility, for instance, will be what everybody quotes in the market. So the analytics have become a standard for value, and it's now just quibbling over where the value really is. So think about Bloomberg as having roughly 350,000 subscribers — their companies are generally paying — and you get to about that $10 billion figure. An amazing business, one for which I have total admiration. Now let's talk about Reuters. Reuters has a long, long history, of course, but I'll start in 1986. The Advanced Reuters Terminal really was the standard for foreign exchange and commodities trading. To my knowledge it never really caught on in equities and fixed income, but if you were on the J. Aron trading floor, you saw the terminals everywhere. Each terminal was really a Windows machine — a pretty old version of Windows, not updated very often — and they would be connected to a Unix server on site. That Unix server would have what we then thought of as a high-speed connection, a broadband line, connecting it to the Reuters network. "High speed" in context is one and a half megabits per second; now many of us have gigabit connections at home. In 2008, the Thomson Corporation
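Two pieces of arithmetic from this passage are worth checking: the subscriber count times the seat price gets you to the revenue estimate, and the old "high speed" line is dwarfed by a modern home connection.

```python
# Sanity checks on the figures in this passage.
subscribers = 350_000
price_per_seat = 28_000  # USD per person per year

annual_revenue = subscribers * price_per_seat  # 9.8 billion -- "about $10 billion"

# "High speed" then vs. a home gigabit connection now.
reuters_line_mbps = 1.5
home_gigabit_mbps = 1000
speedup = home_gigabit_mbps / reuters_line_mbps  # roughly 667x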
acquired Reuters and renamed itself Thomson Reuters; at the same time, it rebranded the Reuters terminal, calling it Eikon. Just ten years later, Thomson Reuters spun off its Financial & Risk unit, including the Reuters terminal, calling it Refinitiv, and in that transaction sold a 55% majority stake to Blackstone, thereby valuing the business at $20 billion; it had $6 billion of annual revenue at the time. Then in 2019 the London Stock Exchange agreed, in August of that year, to buy Refinitiv for $27 billion — a nice bump up from 20 — but that transaction hasn't closed yet. S&P Global is another one of the big businesses. It began as the McGraw publishing company in 1899, and ten years later combined with Hill to become McGraw-Hill. Those of us of my generation will remember McGraw-Hill as the publisher of almost all our textbooks. In 1966, two years after I was born, McGraw-Hill purchased the credit rating agency Standard and Poor's. Many years later, in 2016, the company changed its name to S&P Global. Another highlight that's important to me: in 2018 S&P Global bought Kensho. Kensho was backed by Principal Strategic Investments — you've heard from the head of PSI, Darren Cohen. I looked after PSI for many years. We discovered Kensho early on and invested early, when it was just a small number of people coming out of Harvard, in something resembling a garage in Cambridge, Massachusetts, applying machine learning to create baskets of stocks that outperformed in and around events, where the events could be somewhat amorphous — such as rising geopolitical risk in the Middle East. Kensho would find the baskets of stocks. Now, classic PSI strategy playbook: we introduced Kensho to our clients and to our competitors, we had hackathons, and we created APIs where a client could create a custom basket in Kensho and, through the API, drop it into the Goldman Sachs trading business and tailor it and refine it and execute the basket.
That all was part of a virtuous cycle that led to Kensho's transaction with S&P Global, which to my knowledge is the biggest exit so far for a pure machine learning company. Rolling up to 2019: a little under $7 billion of annual revenue at S&P, and $3.1 billion of it is from the ratings business. Almost all corporate bonds and all sovereign bonds get ratings from S&P — and not only S&P, of course. Two billion dollars of the revenue comes from the Market Intelligence business, and a little under a billion dollars of the annual revenue comes from the S&P Dow Jones Indices business — a joint venture with Dow Jones, which I showed on the prior slide, where S&P has the majority share. Let's talk about Goldman Sachs — not by any means known for being in the data business, though it is a recent entry there, and data has for decades been at the core of what Goldman Sachs does. So a little pocket history of Goldman Sachs. Founded by Marcus Goldman in 1869. Note the name: you know the name Goldman; the name Marcus, of course, is the consumer brand. I remember the early discussions where we were advised to take a name and then pour meaning into it to create the brand — and why not choose the name of one of the founders, in this case the first name? Don't be too surprised if you see names of other founders or family members show up in other Goldman products and services. In 1906, Goldman took Sears Roebuck public. The 1970 Penn Central bankruptcy almost took down Goldman Sachs. In 1981, in a fascinating move, Goldman acquired J. Aron and Company, and in that transaction the J. Aron partners became Goldman partners. The companies had quite different cultures. J. Aron's trading business started off as coffee trading in New Orleans, and later on moved into gold and foreign exchange and derivatives trading. Famously, Lloyd Blankfein did not get a job when he applied to Goldman Sachs; he went to J. Aron and did get a job, and you know the rest of the history.
In 1986, Goldman formed the asset management business, GSAM. In 1993 it began the development of TSDB, the time series database, and SecDB, the securities database. The firm had its IPO in '99. Lloyd Blankfein became the CEO in '06, when Hank Paulson left to become Treasury Secretary. In 2015 Goldman launched the consumer business — you've heard from Omer Ismail about the consumer business — and in 2018 David Solomon became the CEO. Now let me say a little bit about SecDB and TSDB. In those days, Lloyd was having a hard time getting the risk management system he wanted; the conversations with the Technology Division were not proving fruitful. One analysis is that Lloyd wasn't a computer scientist and didn't have the particular skill set of creating a functional specification, and the software engineers in the Technology Division hadn't experienced the trading business — they were talking past each other. So Lloyd had the inspiration to go and get his own engineers. In those days you went to Bell Labs, and he hired Armen Avanessians; Armen later hired me. Armen created a group of what we would now call data scientists. He called them strategists, or strats, branded them separately from the Technology Division, and they were in the trading business, part of the trading desk. I was roughly strat number 12 back in '93. Armen had some inspirations that, like all great inspirations, seem totally obvious in retrospect but were not obvious at the time. He had a platform vision. He said: let's create some software, and let's put all the data, all the time series, all the models, all the risk, all the reports in one place — one repository, one database — and that way, whenever we're wondering where something is, we immediately know the answer: it's in the one place where we put everything. We started with the time series, which seemed more tractable, and so was born the time series database, TSDB. Later we increased the aspiration to include all the trades and positions and models, and that was SecDB.
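The one-repository, recompute-on-demand idea can be sketched in a few lines: everything addressable by name in one store, derived values expressed as functions of other nodes, so a counterfactual question ("what if this input were different?") is just an override plus a re-read. All the names and the data model here are hypothetical; Slang and SecDB themselves are proprietary, and this is only a toy illustration of the style.

```python
class Node:
    """A dataflow node: either a stored value or a function of other nodes,
    recomputed on demand so that overrides propagate automatically."""
    def __init__(self, value=None, fn=None, deps=()):
        self.value, self.fn, self.deps = value, fn, deps

    def get(self):
        if self.fn is None:
            return self.value
        return self.fn(*[d.get() for d in self.deps])

    def override(self, value):
        """Counterfactual: pin this node to a hypothetical value."""
        self.value, self.fn = value, None

# One repository: everything lives in one place, addressable by name.
db = {}
db["usdjpy.spot"] = Node(value=110.0)
db["trade.notional_usd"] = Node(value=1_000_000)
db["trade.value_jpy"] = Node(
    fn=lambda spot, notional: spot * notional,
    deps=(db["usdjpy.spot"], db["trade.notional_usd"]),
)

base = db["trade.value_jpy"].get()    # 110,000,000 JPY
db["usdjpy.spot"].override(105.0)     # "what if spot moved to 105?"
bumped = db["trade.value_jpy"].get()  # 105,000,000 JPY
```

Recomputing on every read keeps the toy correct without cache invalidation; a production system would memoize and track dependencies so only affected nodes recompute.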
In TSDB, one of the attitudes was: we never met a time series we didn't like — capture it and clean it. Another was: put everything in SecDB. Now, of course, the software to make this possible just didn't exist at the time; the compute power was barely at the level where it was doable, and certainly the software architecture wasn't there yet. And so we created all of our own. We created what you'd now call a NoSQL database. We created a language called Slang — securities language — that, if you step back and squint a little bit, looks an awful lot like Python. It was a dataflow language optimized for counterfactual questions. In TSDB we created our own repository of data and our own analytics language, and that was quite powerful. You could, for instance, create a trading strategy and immediately visualize it, and then you could just put that trading strategy inside a function that would calculate the cumulative profit and loss of executing it over a period of time. So say you want to know the cumulative P&L of executing a 20-day historical moving average trading strategy on dollar-yen — when the price drops below the 20-day historical moving average you buy, and when it goes above you sell — you could, in one line, see the cumulative P&L. Which, by the way, is a trading strategy that worked really well for a long, long time, and then it flatlined — which is the fate of essentially all trading signals, maybe all trading signals: eventually they get arbitraged away. Many traders who left Goldman Sachs say they miss all kinds of things, certainly the people, and they also say: I really miss Plottool. Well, now there's an answer. TSDB and Plottool itself are available as a series of APIs — we referenced them back in module two, and some of you have gotten access to the API — and there's also an HTML5 user experience. So those traders who miss Plottool can now have Plottool; we're making all of that available as a subscription service.
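The 20-day moving-average rule just described can be sketched in a few lines of Python. This is my own illustration on a synthetic price series, not the TSDB/Plottool API: hold +1 unit when the price is below its trailing moving average, -1 unit when above, and accumulate the P&L.

```python
def moving_average(prices, n):
    """Trailing n-period moving average (shorter window at the start)."""
    return [sum(prices[max(0, i - n + 1): i + 1]) / (i - max(0, i - n + 1) + 1)
            for i in range(len(prices))]

def cumulative_pnl(prices, n=20):
    """P&L of holding +1 unit when price < trailing MA, else -1 unit."""
    ma = moving_average(prices, n)
    pnl = 0.0
    for i in range(len(prices) - 1):
        position = 1 if prices[i] < ma[i] else -1  # buy below the average, sell above
        pnl += position * (prices[i + 1] - prices[i])
    return pnl

# A synthetic mean-reverting series, on which this rule should make money.
oscillating = [100 + (5 if i % 2 == 0 else -5) for i in range(60)]
```

On a series that oscillates around its average the rule profits, which is the regime in which the real signal worked; on a trending or arbitraged-away series the P&L flatlines or worse, exactly as the lecture describes.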
So now let's talk a little bit about disruption of the data business. The disruption has arrived, as traders rely less and less on legacy front-end systems — and in some businesses, those businesses rely less and less on traders and salespeople. There's been a massive emphasis on streamlining the workflow, and asking: do we really need people talking and typing in the middle of these workflow loops? Or could we stitch together buy-side, exchange, clearing house, and sell-side systems with APIs, so there's no need for people in the middle to click and press buttons, or talk on the phone, or type in Bloomberg chat rooms? You're seeing the rise of communication tools — we'll talk a little bit about Symphony — and those tools are generally offering APIs everywhere. Symphony raised $165 million at a $1.4 billion valuation in 2019. It has more users than Bloomberg — the last estimate I saw was about 450,000 regular users — at a much, much lower price point. The idea behind Symphony is to make it ubiquitous across the financial system: not just for traders or front-office people, but for all people working in the financial ecosystem. Anecdotally, the usage of Symphony has exploded during the pandemic. Now a little bit of the back history. Some industry participants — and you might be looking at one of them — wanted Bloomberg to make available all the capabilities of the Bloomberg terminal as APIs. That didn't quite happen; more recently, Bloomberg is on that journey of APIs everywhere. And there's a fascinating story behind it. Slack did not have encryption built in, and we made the assessment that you couldn't graft on the encryption and the data privacy and protection that regulated institutions required — it needed to be built in from the beginning, and I have strong conviction on that point. Information barriers: at last count, there are over 150 information barriers at Goldman Sachs.
It's not just the classic so-called Chinese wall — do not direct the purchase and sale of securities while in possession of material nonpublic information. There are many other information barriers: who can say what to whom, and when — that's an information barrier. Symphony allows individual institutions to fully specify all their information barriers, and it's built into the platform. And all the communications are encrypted, which is not the case on other platforms. That's part of the design of Symphony. In addition, there's demand for all kinds of exotic datasets. Many data sets — closing prices for listed equities, for instance — are just commodities now, and you can get them from a lot of places. But if you want intraday implied volatility surfaces for individual stocks and indices, there aren't too many places where you can get that, and Goldman Sachs is one such vendor. And there's a lot of sharing of data sets, and partnerships, and channel distribution of data sets — again, all of it based on APIs absolutely everywhere. There are a number of new companies, as numerous as the sands at the seashore, some of them established yesterday, and they're all working on big data, data management, and machine learning as a service, for the financial industry and for other industries. Of course Palantir, with its estimated annual revenues of over a billion dollars, is an important and major player in the space. It's privately held and provides custom software and data solutions for large financial players, with many use cases in underwriting and anti-money-laundering. There have been mixed results: as with any service, some people love it, others less so — you can read all of the stories. The major question is: is it a software company — is there some set of repeatable tools that Palantir is packaging and making available — or is it a software-driven management consultancy? It's not clear to me how one would even answer that question, but it's an important question.
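The "who can say what to whom" idea can be sketched as a policy check: a barrier is an unordered pair of groups whose members may not message each other. The group names and the data model here are invented for illustration; this is not Symphony's actual API.

```python
# A barrier is an unordered pair of groups that may not communicate.
# Group names are hypothetical, for illustration only.
BARRIERS = {
    frozenset({"ma_advisory", "equities_trading"}),  # the classic wall
    frozenset({"research", "investment_banking"}),
}

MEMBERSHIP = {
    "alice": {"ma_advisory"},
    "bob":   {"equities_trading"},
    "carol": {"research", "quant"},
}

def may_message(sender: str, recipient: str) -> bool:
    """True unless some barrier separates a group of the sender
    from a group of the recipient."""
    for g1 in MEMBERSHIP.get(sender, ()):
        for g2 in MEMBERSHIP.get(recipient, ()):
            if frozenset({g1, g2}) in BARRIERS:
                return False
    return True
```

A real deployment layers on time windows, wall-crossing approvals, and audit logs, but the core check is this set lookup performed on every message.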
And of course it would be valued quite differently depending on which of the two it is. There are many others; I won't go into them in detail, and this is not by any means intended to be a representative sampling of the sands of the seashore. Cognizant was spun out of Dun & Bradstreet in 1996; it offers data analytics services. More recently, H2O is an open-source, distributed, in-memory machine learning platform. Splunk captures, indexes, correlates, and visualizes real-time data. Mu Sigma is an Indian management consultancy that offers data analytics as a service. Much more recently, Reality Engines — I'm one of the advisors — deduces a deep learning network architecture for standardized use cases. It's a complex art to figure out what kind of network architecture to use, and Reality Engines figures that out; there's fascinating intellectual property there. One of those use cases is fraud detection. You can think of Reality Engines as machine learning as a service, but with about ten or so heavily standardized use cases, so it does not need to be customized or micro-customized for every vertical. And let's go on and talk about some others. App Annie is an evolving standard for investors who want to understand the usage of apps, and the rankings and trends. Geospatial Insight takes satellite data and amalgamates it to track, for example, oil inventory levels at storage facilities — certainly when the price of oil went negative, you can imagine all the interest in that data set. Thinknum tracks when companies are hiring employees, interacting with customers, moving product, and many other indicators, all in one place. And of course, we've already talked about Kensho, which is using machine learning pattern-recognition techniques to create what they're calling market indices for the new economy.