One of the unique things about early stage investing is the ability (and in my view, the need) to continue investing in companies across multiple rounds.

Late stage, public market, private equity, real estate, and most other popular forms of investing typically involve a single investment or a time-limited series of investments.

But at USV, we typically will make four to six investments in a “name” over five to seven years.

And we do this style of investing with a fixed pool of capital.

So we have gotten very analytical about modeling out our reserves for our follow on investments.

What we do is maintain a spreadsheet of every investment in a given fund, the likely amount and timing of future follow on investments, and the probability that we will have the opportunity to make those investments.

We then run a Monte Carlo simulation 1000 times, draw a distribution curve of outcomes, and manage our funds against that.
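For readers who want to see the shape of such a model, here is a minimal sketch in Python. The portfolio names, amounts (in $M), and probabilities are made-up illustrations, not USV's actual numbers, and a real reserve model would also handle timing; but the core idea, sampling which follow on opportunities materialize and looking at the distribution of capital needed, is the same.

```python
import random
import statistics

# Hypothetical reserve model: each portfolio company has an estimated
# follow-on amount ($M) and a probability that the opportunity arises.
# These names and numbers are illustrative only.
portfolio = [
    {"name": "Company A", "follow_on": 3.0, "probability": 0.8},
    {"name": "Company B", "follow_on": 5.0, "probability": 0.5},
    {"name": "Company C", "follow_on": 2.0, "probability": 0.9},
    {"name": "Company D", "follow_on": 4.0, "probability": 0.3},
]

def simulate_reserves_needed(companies, trials=1000, seed=42):
    """Monte Carlo: in each trial, each follow-on opportunity either
    happens (with its probability) or doesn't; sum the capital needed."""
    rng = random.Random(seed)
    totals = []
    for _ in range(trials):
        total = sum(c["follow_on"] for c in companies
                    if rng.random() < c["probability"])
        totals.append(total)
    return totals

totals = sorted(simulate_reserves_needed(portfolio))
print(f"mean reserves needed: ${statistics.mean(totals):.1f}M")
print(f"90th percentile:      ${totals[int(0.9 * len(totals))]:.1f}M")
```

The distribution curve (here summarized by the mean and a high percentile) is what you would manage the fund against: if the 90th-percentile reserve need exceeds uncommitted capital plus recycling capacity, the fund is over-extended.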

We have a “cushion” for error, which is our ability to recycle roughly 20% of our funds, and that has come in handy on every fund we have managed at USV.

I think the proper allocation of follow on capital into the portfolio, making sure you can follow your winners and defend your position in certain situations, is absolutely critical to producing top tier returns.

It is not as important as portfolio selection (which comes from our thesis) or our work on the boards of our portfolio. Those two things are the most critical factors in our performance.

But I think capital allocation/fund management is third on the list and is a missed opportunity for many early stage investors.

#VC & Technology

Comments (Archived):

  1. JLM

    First rate blog post.

    JLM
    www.themusingsofthebigredca…

  2. Mark Gavagan

    Can an investor with pro rata rights, but without the means or appetite to invest in a subsequent round, sell those rights to another party?

  3. awaldstein

    This implies that there is a minimum fund size below which both support from the VC and follow on investments don’t make a lot of sense.

    Thoughts on the size of funds from this perspective?

    1. pointsnfigures

      True. USV has a $125MM fund, plus an Opportunity Fund to follow on. Many of the larger VC funds have that. As a microVC, we reserve for Series A, but know full well we aren’t playing in later rounds.

      1. awaldstein

        Thanks! Do you have a minimum size that makes sense math-wise? For follow-on, and I presume for support from the VC as well, as that needs to come from management fees.

  4. Jay DeVivo

    Thanks, that was very interesting. I’ve never heard an explanation beyond “about 50%.”

    Since your investments are all around a thesis (and therefore not “random”), have you seen periods of high correlation such that you couldn’t invest as much as you would have liked/planned because more portfolio companies were hitting the marks for later rounds and needed capital, or does the ability to recycle 20% take care of that?

  5. jason wright

    “…and defend your position in certain situations…” – don’t you have an emergency contingency ‘in time of war’?

  6. sigmaalgebra

    If you have enough data to do the Monte Carlo, then you have enough data to compute the best decision instead of the Monte Carlo. It’s a stochastic optimal control problem and has a best possible solution and algorithms to find it.

    The “manage our funds against that” is a mud pie instead of a diamond.

    Definitely you are leaving a big pile of money on the table, USV and likely every significant venture firm in the world.

      1. sigmaalgebra

        NO! Wildly NO. $billions of times NO, NO, NO.

        Look: A quarterback does NOT call all four plays on first and ten based on some average values and, instead, calls each play after seeing the field position, things about the other team, etc. Moreover, although quarterbacks can’t much hope to do this, a quarterback might call the first play based on what happens in the third play when the second play has been called after the results of the first play. Even that is just trivial, baby talk on the best that can be done.

        Broadly, (1) we want to make each decision exploiting all the information we have when the decision is made, i.e., we can see the past and present but can’t predict exactly the random parts of the future (and having a good estimate of some expectation, i.e., average, about the future is NOT enough); (2) generally if we can do Monte Carlo we can do (1); (3) doing (1) is the best that can ever be done; (4) in practice, how to do (1) is, for the computing resources, anywhere from not trivial to beyond taking over all of the Amazon server farms for a year+. This is all based on some theorems and proofs, not all easy, in the theory of stochastic optimal control. Disclosure: I wrote my Ph.D. dissertation in that field.

        The field is about best decision making over time under uncertainty, and the “best” is true in the strongest possible sense: I have such a theorem in my dissertation.

        I know; I know; in 1940 experts in the design of iron bombs and artillery shells would not have believed in Fat Man and Little Boy, either.

        There is more than one way to look at the field, but here is a simple view of the first, usual way, correct in theory, though some changes might make the computing faster.

        To make the description easier, let’s use discrete time instead of continuous time. So, maybe we will make decisions once a month, week, day, hour, minute, …, picosecond, but still discrete and not actually continuous. And to make the problem easier, we have a fixed, known time for the end of the problem. So, we know how many decisions, say, once each minute, between now and the end time. So, let the number of decisions be n. We don’t make a decision at the end and instead just go to the “pay window” (JLM) with what we do have then. So there are n + 1 points in time, say, i = 1, 2, …, n, n + 1; we make decisions at the first n of these; and the last point is the trip to the pay window.

        So, since we know so much about our problem, say, enough to do Monte Carlo, we go to time n + 1 and write down ALL that can be the case, ALL the combinations. E.g., if we permit 50 investments with the most valuable worth <= $10 T, then, with investment values in discrete pennies, we have (1000 * 10^12)^50 cases, that is, possible states. Did I mention we could use a lot of computing? So for each of these states, we, as we can, easily enough, write down what our trip from that state to the pay window will be worth. To look ahead, we are trying to maximize what we get at the pay window or, in the presence of uncertainty, its expectation. But we don’t have to worry about randomness or expectations now; instead, we just have all those possible states and for each one note what it is worth.

        Then we back up in time to decision n. There we consider all the possible states, say, a little less than our (1000 * 10^12)^50 above.

        Then we pick one of the states and consider all the decisions we might make. For each decision, we make it, apply the randomness, i.e., essentially in the sense of Monte Carlo, and get all the states at point n + 1 we might reach and the probability and pay window value of each.
        So now, from all the states at time n + 1 and the value and probability of each, we have a distribution of results at the pay window; we take the expectation, which with some common assumptions (about utility functions) turns out to be the best thing to do.

        We repeat this for all our possible decisions at decision n. From all those decisions, we note the one that maximized the expected value at the pay window and, for that state at decision n, note that value. At decision n, we repeat all this for each of the possible states.

        Then we back up to n – 1, rinse and repeat, …, i = n, n – 1, n – 2, …, 1, all the way back to decision i = 1.

        Then in practice, to apply all this, after Bonneville Dam refills after we drained all its water running Amazon’s server farm, at i = 1, we know what decision to make: i.e., we just note our current, initial state and look up the decision we found for that state. So far we do NOT know what decision we will make at decision i = 2, but when we get to i = 2 we just note our state and that we have already calculated what decision we would make at that state at i = 2. Then as the actual time passes, for each decision, we just look up our state and the decision we have already determined for that state.

        There are some assumptions to be made, but likely the Monte Carlo is making those assumptions already.

        From the law of large numbers and some common utility assumptions, maximizing the expected value really is the thing to do, but we NEVER try to make an ‘average’ decision ignoring state, no more than in football.

        Take any other means of making decisions in terms of only what is known when the decision is made, for each i = n, n – 1, …, 1 go through essentially the math outlined here, and see that for each i the other means of making the decision can’t do better.

        Write out all of this with careful mathematical notation, with careful use of some assumptions, especially conditional independence, tap lightly with the classic Fubini theorem, and done. Uh, there is an issue having to do with a tricky topic called measurable selection, but we can leave that to the second grade version!

        So the math I outlined is the best possible way to make the decisions.

        There is work on getting the same results but making the computing go faster. E.g., in my dissertation the simple approach looked like it would run for 64 years, a bit too long for a Ph.D., but I had some mathematical ideas, did write the corresponding software, and got the time under 200 seconds.

        It is interesting to note that, really, standard spreadsheet software is a good, at least conceptual, user interface for posing such problems. So have n + 1 columns and one row for each state variable. Set up the spreadsheet for Monte Carlo. Then take THAT spreadsheet as a formulation of a problem in stochastic optimal control.

        At one time I proposed to IBM actually to do that. On small problems, it could run nicely fast. On larger problems, sell some cloud computing. Then work on more means to make the computing faster, including some means of approximation. Here multivariate splines come to mind. Also in approximation there is a role for neural network data fitting.

        I got back only laughs.

        But now apparently there are at least two fairly serious applied research efforts for essentially what I proposed. So, currently this problem is cutting edge research.

        The first good objection is that we really don’t have data good enough for Monte Carlo. But if we DO believe we have data good enough for Monte Carlo, then, as I outlined, likely we have data enough for stochastic optimal control.

        How good the data for the Monte Carlo has to be so that the results of the corresponding stochastic optimal control will yield really good results in practice is likely TODO research. How to do the research?
        The usual way: start with a simple case and try to get research results there, get some hints about the general case, and pursue that.

        Yes, Black-Scholes is a really simple special case.

        So, as described, at each of the n + 1 times we have some possible states, and for each of those states at that time we have a decision. Call those our decision rules. Maybe in getting those rules we made some approximations. Or maybe we just want to check our work.

        So, right, using those decision rules, we can do a Monte Carlo evaluation, say, 500 trials, and take the average. So in this way we can compare these decision rules with whatever else we might have done.

        A really nice feature of having all this data and these assumptions is that we can do such comparisons. And, if we didn’t make any approximations or errors, we can see the difference in expected value between our decision rules and anything else: We can expect that for relatively complicated problems, decision rules from stochastic optimal control will be much better, i.e., on average make much more money, than any much simpler alternative. We can see the difference in numbers, which we have to believe in since we did believe in the data and assumptions of the original Monte Carlo approach.

        So, you are saying that what I described above is what Fred has been doing? Amazing.
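[Editor's note: the backward induction sketched in this comment can be shown concretely on a toy problem. Everything below is an illustrative assumption, not anything from the comment or from USV: capital comes in whole units, each invested unit doubles with probability P or is lost, and the "pay window" utility of ending with s units is sqrt(s). The code builds the value and policy tables by working backward from the terminal time, exactly the recipe described.]

```python
import math

# Toy stochastic optimal control via backward induction.
# All parameters are assumptions for the sake of a runnable example.
P = 0.6       # per-unit success probability (assumed)
N = 3         # number of decision points i = 1..N
START = 4     # starting capital in units (assumed)
S_MAX = START * 2 ** N   # largest state reachable from START

def binom_pmf(k, n, p):
    """Probability that k of n invested units succeed."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# value[i][s]  = best expected pay-window utility from state s at time i
# policy[i][s] = units to invest at time i in state s
value = [[0.0] * (S_MAX + 1) for _ in range(N + 2)]
policy = [[0] * (S_MAX + 1) for _ in range(N + 1)]

# Time n + 1: no decision, just the trip to the pay window.
for s in range(S_MAX + 1):
    value[N + 1][s] = math.sqrt(s)

# Back up i = N, N-1, ..., 1, maximizing expected value at each state.
for i in range(N, 0, -1):
    for s in range(S_MAX + 1):
        best_v, best_a = -1.0, 0
        for a in range(s + 1):                 # invest a of s units
            ev = sum(binom_pmf(k, a, P) *
                     value[i + 1][min(s - a + 2 * k, S_MAX)]
                     for k in range(a + 1))
            if ev > best_v:
                best_v, best_a = ev, a
        value[i][s], policy[i][s] = best_v, best_a

print(f"expected utility from {START} units over {N} decisions: "
      f"{value[1][START]:.3f}")
print(f"optimal first move in state {START}: invest {policy[1][START]} units")
```

The tables are the "decision rules" the comment describes: as time passes, you look up your current state and read off the precomputed decision. Real reserve problems have vastly larger state spaces, which is precisely the computational objection raised above.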

  7. Mike

    Makes sense. You don’t get paid until the exit so the more you can control and direct through to this outcome the better off you are? This probably applies to many other things outside of early stage investing.

  8. Tom Labus

    This does hold true for the public markets also. Being able to build on and protect a winner is a crucial component of hitting one out of the park. Too many funds can’t do this for whatever reason.

  9. Michael Aronson

    Any chance you could post a simplified example of how you look at this?

  10. Richard

    There is an easy solution to this. Since all the gun action is males 16-24, and since all 16-24 are included in their parents’ health insurance, require all males in this age group to undergo mental health screening and classify each according to a risk score.

  11. LIAD

    “probability of us having the opportunity to make those investments” - curious about the ways in which you wouldn’t have the opportunity - is that made up exclusively of the founder welching on the pro-rata rights?

    1. Richard Carlow

      Isn’t it more likely that the company is simply not raising more capital?

      1. LIAD

        I’m confident the overriding majority of companies USV backs go on to raise (multiple) additional rounds of funding.

  12. Rick Bashkoff

    What does recycling a fund mean? “…our ability to recycle roughly 20% of our funds”

  13. Adam Steiner

    Do you ever have scenarios where a company (“Company A”) in a fund (“Fund A”) needs an additional investment when Fund A is basically done with its investment phase (even with the recycling)? I always wondered if that happened and how you manage the potential conflicts in doing a follow up investment through Fund B or an Opportunity Fund.

  14. KB

    Very informative. Thank you