The Rise of the Quants

My favorite anecdote in Michael Lewis’s The Big Short is about one of the first people to make big bets against mortgage bonds, a Deutsche Bank trader who liked to brag that he employed China’s second-smartest mathematician. According to Lewis, the trader talked about the mathematician as if he were “a pet tied to his desk.” When anyone doubted the mathematician’s claims, the trader would say: “How can a guy lie who doesn’t even speak English?”

It’s hardly news that one of the world’s biggest brains dedicated his life to modeling the havoc created by a million subprime mortgage brokers. If “A Beautiful Mind” had been set in the 1980s, it would have ended at a hedge fund, not with a Nobel Prize. It’s sort of a tragedy. I’ve sometimes wondered if God calibrated the size of our brains and the amount of fuel in the sun to give us just enough time to figure out the universe and send a space-ark toward a new galaxy, but the only guys who could figure this out are working for Wall Street.

What’s interesting now is that they’re fanning out to other industries on Main Street and Sand Hill Road. The most coveted employees in Silicon Valley may no longer be software engineers, but mathematicians. And the reason is simple: we now record so much data about what people are doing within the vast virtual world of the web that our biggest challenge is just making sense of it.

My last trip down to the Valley was a field trip set up by one of our investors, Greylock Partners. I met a mathematician who once developed models for predicting the likely locations of nuclear weapons in Iraq. He’s now spending his time more profitably at a social networking site, working out when to send diffident users a “win-back” email. Later that same day, I met the Chief Marketing Officer for the website of one of the world’s largest retailers. He had no interest in my questions about branding. The team he ran was a team of mathematicians.

And the world he described was fascinating. Imagine if, every time you walk into Anthropologie or Macy’s, a guy with a clipboard follows you around, noting the path you take through the racks, the clothes you pick up, the ones you try on, the ones you get in line with, and the ones you finally buy.

He measures how much time you spend on each floor, and he comes into the changing room with you to measure how long you spend at the mirror sizing up each shirt. The next time you stop in, the whole place is re-arranged so that you don’t have to walk as far, or see clothes you don’t like, and its decor has shifted in subtle ways, which somehow makes you want to stay longer.

This is what’s happening within every well-run website. Just ask Jeff Hammerbacher, who built the data storage system for Facebook. He visited Redfin last month to talk about Hadoop, an open-source data storage system built to support the analysis of vast data sets. Jeff observed that most data now is collected by machines monitoring other machines, not by a machine collecting the input of a Macy’s sales clerk, or storing a letter to your mother.

The difference in volume between machine- and human-produced data is as great as the difference in volume between Model Ts and their hand-built predecessors. With a single setting adjusted, a web server can increase the amount of data it sends to another machine by a thousandfold.

It dawned on me then that Jeff hadn’t come to talk about how Facebook stored all your messages and status updates; he came to talk about what turned out to be a far larger data set: how Facebook stored data about what you were doing before posting that message, and what you did next.

The result? Facebook was capturing a terabyte of information about its users every day. A trillion bytes of data. In 2007. Back then, Facebook was a tenth its current size.

This data creates a new competitive dynamic. First, it favors size: Lowe’s knows better than the corner hardware store what to stock because it has more data about what people want. CarMax can price used cars better than a mom-and-pop dealership because it has more data on what people will pay for, say, a 2008 Honda Odyssey. CarMax’s founder, Austin Ligon, called this information dominance.

More importantly for Redfin, this dynamic favors company-owned operations over franchises. Last Tuesday, I visited CNBC on the same day as the CEO of Coldwell Banker. We were both interviewed about the market. And the CB CEO should have known far more about it than I do: he has decades of experience, not just a few years; he manages an organization with tens of thousands of real estate agents, not a few dozen. In short, he has forgotten more about real estate than I will ever know.

But outside of calling one agent after another, the CB CEO has no way of knowing what his agents are doing; most work as contractors, for franchises, recording their deals in spreadsheets and notepads. Redfin, on the other hand, has a system for scheduling home tours and writing offers, which means we also have a system for storing data about every tour and offer. Months before the numbers are recorded at county courthouses or by federal agencies, we know when bidding wars are back, or when tire-kickers have taken over the market. We can see the whole elephant, and we’re minutely sensitive to when he’s about to roll on top of us or stampede through the jungle.

When Redfin was asked what would happen when the Commerce Department announced its numbers at 10 a.m., others guessed. We didn’t.

(Earlier we said that Coldwell Banker’s CEO writes a column for Inman News. This was incorrect. We apologize for the error.)

Discussion

  • joewallin

    Glenn, you might like this, my favorite quant quote from the financial crisis: “Wednesday is the type of day people will remember in quant-land for a very long time,” said Mr. Rothman, a University of Chicago Ph.D. who ran a quantitative fund before joining Lehman Brothers. “Events that models only predicted would happen once in 10,000 years happened every day for three days.” http://online.wsj.com/article/SB118679281379194

    • http://blog.redfin.com GlennKelman

      Grey is every theory, green is the tree of life…

  • leosh

    When I listen to Jim Gillespie I can't help but hear Jerry Lundegaard on the phone with the financing folks… it's really kind of amazing.

    • http://blog.redfin.com GlennKelman

      I love that movie LeoSh…

  • http://twitter.com/evanjacobs Evan Jacobs

    Our ability to collect data is quickly outpacing our ability to analyze and react to it. The solution to this problem is not just developing an increased capacity to store and parse data (e.g. hiring mathematicians and using Hadoop) but also developing the skill to formulate hypotheses which can then be confirmed or rejected by looking at data. In other words, data is more than just a snapshot of current behavior. It should be used to provide the answer to “What do we do now?”

    As a simple example, Redfin might formulate the hypothesis that homes with more than one photo attached to their listings receive more tour requests than homes with 0 or 1 photos. Redfin could validate this hypothesis by logging the number of photos attached to a home when the tour request is made and then comparing the number of requests across similar homes. If the hypothesis turned out to be correct then the answer to “What do we do now?” is to figure out how to encourage more photos for each listing.

    • http://blog.redfin.com GlennKelman

      Couldn't agree more Evan. There's a tendency to ask questions out of curiosity rather than pragmatism. The key question was the one Lenin once asked: “WHAT IS TO BE DONE?”

      • Mosaic

        when the idle, abstract wonderings of a human mind find reality they become the sharpest knife. to climb the tree of knowledge using the ladder of reality is to reach only the lower branches. pragmatism, pure pragmatism, lacks creativity — but it is the creative interpretation, the weaving of a story from data, that will tell you what is to be done.

  • http://blog.findwell.com Kevin Lisota

    Glenn, I agree that the ability to gather and react to voluminous data about consumer behavior will transform the real estate industry for the better. Quite frankly, it simply makes it easier to run a volatile/seasonal business when that sort of data is available.

    Franchise brokerages, and all brokerages of any size, do actually have a considerable amount of data at their disposal. If you visit any well-run brokerage you would see that they have a well-oiled transaction management system in place. That ensures broker oversight, proper record keeping, etc. Every one of these brokerages knows how many accepted offers their agents have, how many pending, how many failed and how many closed. Such records are kept up-to-date, otherwise you don’t get your commission checks. Your data set at Redfin is more complete by tracking tours and unaccepted offers, but they have the same fundamental data at their fingertips and they certainly don't need to call agents one-by-one.

    Whether the larger franchises effectively roll up this data to the honchos in their HQ office, I’m not sure, but certainly at the local office level there is loads of data that can be used to track the pulse of the market. I think the more fundamental problem for real estate is that even if the CEO of Coldwell Banker had an uber-source of the latest market data from his agents, what can he actually do with it? He can’t effect change amongst a loosely-branded horde of independent contractors.

  • http://blog.redfin.com GlennKelman

    Good point Kevin, I agree that a transaction management system provides each office with data, provided the agents use it. What helps us is that customers approach us via an online system so we can track the process from its earliest stages…

    • http://blog.findwell.com Kevin Lisota

      I'm not talking about individual agents tracking their sales pipeline through Top Producer or some other such nonsense. Walk into any traditional brokerage and ask the transaction coordinator and/or broker how many accepted offers their agents have this month, how many have failed and how many have closed. They do track this data fairly reliably because their paperwork retention policies demand it. They are able to get agent compliance under threat of withholding commission or even fines if you don't submit your transactions to the transaction coordinator on time. Your system is superior, because it captures more data early in the process, but any reasonable brokerage has a rich set of core transaction statistics at their fingertips and it is up-to-date.

      The question isn’t whether they have the data or not to track the pulse of the market. They already do. The question is whether they choose to review it, or if they are capable of taking any action on it because their agents are simply independent contractors.
