Video Of The Week: Regulating With Data

Here’s a talk my colleague Nick Grossman gave at Personal Democracy Forum last month. We have been advocating for some time, with anyone in government who will listen, that we need to change the paradigm of regulation from “yes/no” to “yes, if”, where the “if” is all about data. We call this new data-driven regulation paradigm “Regulation 2.0”. Nick walks the audience through this thinking in this talk.

And here are his slides from the talk


Comments (Archived):

  1. Guest

    The last thing big-cos and lobbyists want to see is regulating by objective, open data

  2. William Mougayar

    What regulators/government need to help inch them towards this type of innovation is “regulatory sandboxes” where they can safely test out these assumptions. That takes extra money from their budgets, which many of them don’t have. The progressive ones are doing it. Regulatory sandboxes are a more prevalent practice in Europe than in NA.

    1. Nick Grossman

      one issue here I’ve discussed with folks in city government is that any one actor in govt (say, a city hall) doesn’t necessarily have enough control over a given sector / market to create a sandbox, since there’s often an overlapping regulatory regime of local, state, etc

      1. William Mougayar

        so, they could create collaborative sandboxes, no?

        1. Nick Grossman

          perhaps, but that introduces a different kind of political problem, e.g., think about how much NYC and NYS typically squabble over everything

          1. William Mougayar

            Well that’s an opportunity to work together under a neutral territory.

    2. Matt Zagaja

      I think the problem is not merely a jurisdictional omelette to unscramble, but experiments and sandboxes in government, at least in America, are challenging politically. Nobody wants to be pegged with responsibility politically if an experiment fails. One person’s experiment is another person’s “government waste” hit piece.

      1. William Mougayar

        Not really. I’ve talked to US regulators who favor the sandbox approach to testing innovation as a way to prove or disprove the various outcomes.

  3. pointsnfigures

    I am with him on this, especially as it pertains to things like agriculture. Here is the rub: the aftermath of bad outcomes and the lawsuits that will ensue.

    1. William Mougayar

      That’s why we need regulatory sandboxes to test these new assumptions first and get the bugs out of that type of decision-making.

      1. Nick Grossman

        best bet for sandboxes is for certain cities / states to take the lead on specific issue areas — laboratories of democracy, etc

  4. sigmaalgebra

    Wow. Enthusiastic.

    The data sharing, data broker idea: cute. Maybe in some cases it will work.

    Homomorphic encryption: sounds a bit impractical. That is, the subject is so tricky that tough to believe can get much from it at all and, say, still have encryption. Or, what’s amazing about a dog that walks on just his hind legs is not that he does it well but that he does it at all; same for homomorphic encryption.

    Is data valuable? Not nearly always but sometimes yes, very valuable.

    Here is the biggest problem with the talk: There is the claim, at least the suggestion, especially from early in the talk, e.g., the one about how does ride sharing impact traffic in NYC, that with the data, and computing, that now exists we can answer these questions.

    Sometimes, yes, but, sorry, slow down: such successes are rarely so easy, even with a lot of data.

    So, here is some Getting Information from Data 101, Introduction to Lecture 1, points (A)-(C):

    (A) We can get some data and take some averages, get a histogram as a bar chart, etc. Some questions can be answered that way.

    (B) We can do what Fred did in his post…on summer doldrums. He hypothesized that they exist. Then in a post in that thread, in…I took some related data of mine and did a statistical hypothesis test of Fred’s claim. So, maybe there is no effect such as summer doldrums, and a dip in the summer could be from just normal, random fluctuations in the data? Okay, assume so and then, from the available data, what is the probability of getting three months in a row with a total of such little activity? Found that the probability was only about 12%. So, either Fred was right or there are no summer doldrums and, from my data, what was observed happens only about 12% of the time. So, we reject the hypothesis of no effect and continue to entertain that Fred was right.

    The point: We did some statistical hypothesis testing. And, if we are to get very far with Nick’s claims, we will have to do a lot of that. Already some people find such applied math a bit tricky.

    (C) But Nick went on and started expecting to be able to find causality in the data. Generally, often, maybe in significant cases usually, that’s tough to do. Basically the research university departments of social science struggle terribly with many thousands of special cases of that problem.

    E.g., in physics finally Newton found his second law of motion, F = ma, that is, force is mass times acceleration. Or if want to know how to get a car of 4000 pounds to accelerate 20 miles per hour per second, e.g., 0 to 60 MPH in three seconds, need to know how much force need to apply. Then, with tire size, gear ratios, and various losses, can calculate the engine torque needed. So, we have causality: force causes acceleration. Can also use that and the law of gravity to predict the motions of the planets quite accurately for thousands of years. So, in physics, looking for causality worked great.

    Well, social science wants much the same and, thus, has a really big, bad case of physics envy. Envy? Yes. Success? No. Nearly never.

    Generally causality in the social sciences is super tough to find. Sorry ’bout that.

    My brother tried scientific approaches in social science. He tried psychology and concluded that there were two cases, (i) some interesting questions without scientific answers and (ii) some scientific answers for some uninteresting questions. So for his Ph.D., he changed to political science where he could be less scientific.

    (2) My wife tried sociology. One of her profs was J. Coleman, one of the experts and data analysts behind Brown v Board of Education. He pushed hard on the theme of mathematical models in sociology. Another of her profs was P. Rossi who pushed similarly. Both were elected President of the American Sociology Association (ASA). My wife’s Ph.D. dissertation was from some data Rossi collected.

    She addressed the question, as people rise in responsibility in bureaucracies, do they become more cautious? It wasn’t easy to get a solid answer.

    Both my brother and my wife were constantly struggling with what are good measures, how to know if have a good measure (e.g., reliability and validity for a measure), what is the data type, e.g., nominal, ratio, interval, what statistical models to use, e.g., continuous, categorical, linear, log-linear, analysis of variance, what hypothesis tests to use, e.g., parametric or distribution-free, how to use control variables, testing for independence, testing for significance, checking for spurious correlations, how to go from correlation to causality, etc. Definitely many cases of trying to nail jello to a ceiling, at least to a wall.

    From my brother and wife, and more, I learned that doing good science in the social sciences is generally super difficult.

    Yes, there is the claim that with enough data, lots of such questions become simple. Here is some of when that can happen: Just do a lot of cross tabulation. It’s possible to show that cross tabulation is a discrete approximation to essentially the best possible first cut analysis of the data. But for that approach, in general need a lot of data.

    Yes, the large quantities of new data in digital form are terrific, but the desired victories in meaning, information, causality, understanding, and prediction will need more work.

    E.g., just for my startup I derived some new applied math based on some advanced prerequisites. Among pure mathematicians, the prerequisites are less well known than one might guess; among computer scientists, still less well known; among information technology entrepreneurs, still less well known. And my startup is just one case; Nick was making a wide swing of his arm about thousands.
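The hypothesis test described in point (B) can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual method or data: the monthly figures, the function name `exact_permutation_p_value`, and the choice of an exact permutation test are all assumptions made for the example. Under the "no summer doldrums" hypothesis the month labels are interchangeable, so the code counts how many of the possible three-month groupings have a total at least as low as the one observed for June through August.

```python
from itertools import combinations


def exact_permutation_p_value(monthly_activity, summer_months=(5, 6, 7)):
    """Exact permutation p-value for 'summer activity is unusually low'.

    Null hypothesis: there is no seasonal effect, so any group of months
    is as likely as any other to hold the observed values.
    summer_months are 0-based indices (5, 6, 7 = June, July, August).
    """
    observed = sum(monthly_activity[i] for i in summer_months)
    group_size = len(summer_months)
    # Totals for every possible group of `group_size` months.
    totals = [sum(group) for group in combinations(monthly_activity, group_size)]
    # Fraction of groupings whose total is at least as low as the summer total.
    as_low = sum(1 for total in totals if total <= observed)
    return as_low / len(totals)


# Hypothetical monthly activity counts, January through December.
activity = [100, 98, 105, 102, 99, 90, 85, 88, 101, 104, 103, 97]

p = exact_permutation_p_value(activity)
print(f"p-value under 'no summer effect': {p:.4f}")
```

A small p-value means the observed summer dip would be rare if there were no seasonal effect. How small is small enough to reject that hypothesis is a threshold chosen before looking at the data, and the test says nothing about what causes the dip.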

    1. Nick Grossman

      no question all of these challenges exist

      the real point i’m making is that we’re **already** regulating w data, just on the platform side and not on the government side

      which has already won us a lot of new opportunity, in terms of trust, safety and security

      these same gains can carry over to the public side of regulation, if we learn how to bridge the divide

      clearly data brokers will be tough to get right and homomorphic encryption is an early technology; these are just two examples of ways we might think about approaching the problem

      1. sigmaalgebra

        Maybe you are correct, but after watching the whole talk and the beginning three times, I’m not understanding just what you are saying and driving at:

        In the past, regulation was usually based on relatively simple data and/or considerations and was for only selected parts of the economy and society.

        E.g., why do we have fire regulations for buildings? Sure, because of some of the horrible examples of fires in buildings before such regulations.

        Why do we have driver’s licenses? Sure, because a 4000 pound car going 65 MPH is dangerous, and we want to know that the driver is old enough, has passed a driver’s test, is sober, can see, etc.

        Why restaurants have licenses? Sure, because we are concerned about threats to public health from bad food.

        E.g., I found that as my startup goes live, in NYS I will have to tell NYS the name of my business, i.e., that I am “doing business as” in the name of my business. I’ll need a tax ID. And from what I’ve learned so far, that’s it or darned close. So, for my business, so far NYS doesn’t care. “Look, Ma, no regulation!”.

        Nick, for your view of the future of regulation, I’m not getting it. What new regulations? Why?

        Nick, you mentioned some cases where, with the new data, we can answer questions, e.g., find causes, that such information is “in the data” or some such. In some cases, okay. But, as I tried to outline, finding causes in social science data is usually difficult. Next, from what I’m seeing, the questions you were mentioning would be of more interest for, say, city planning and zoning than what is usually regarded as regulation.

        Next, I’m reluctant to believe that solving the sharing problem will make enough new data available to make enough valuable, new information available to be of broad interest. In some specific cases, sure. Broadly? No: getting valuable information on social science questions is usually too difficult.

        E.g., you mentioned something about curing cancer. Gads! Via NIH, foundations, companies, donations, etc., we have invested how many hundreds of billions of dollars in that problem? The work has been one of the crown jewels of civilization. There’s been some progress, lives saved, and I expect much more, in time, essentially real cures for nearly all cancers along with a big fraction of other medical problems. But much of the very best work in statistics is in bio-statistics and, there, no doubt in cancer research.

        Heck, in my post on hypothesis testing, I mentioned B. Efron and P. Diaconis, both at Stanford. From what I gather, at least Diaconis, likely also Efron, are heavily interested, maybe mostly interested, in biological and medical statistics. Also, Leo Breiman, long at Berkeley, student of M. Loeve at Berkeley, both among my favorite applied mathematicians, late in his career, especially in his work in classification and regression trees, was motivated mostly by problems in medical statistics. Actually it is fair to say that Breiman’s work is the main foundation of recent work on machine learning in computer science.

        So, there’s already a lot, much of the best, work in analyzing data to help cure cancer. I’m certainly not going to rush out and claim I know something big and new that will do a lot of good in that area that the best people there have missed.

        Nick, you may have identified some good opportunities, but for your broad picture I’m not understanding it.

  5. James Ferguson @kWIQly

    This has some parallels with sustainability and enterprise energy management issues.

    An energy manager bought in advance what s(he) guessed was needed, then settled for the difference when reality came home to roost.

    They undertook a handful of CSR-friendly projects to show a) willing, b) efforts but with very little focus on c) results, because tracking energy is hard and the IoT (metering is the first case of the IoT) was difficult, and data quantities are overwhelming.

    Rapidly, domain expertise is concentrating near the data hubs (the guys that manage energy data for energy retailers), as it would, and there is a two-step sale to the end user (an enterprise or end-consumer) via the energy retailer.

    This means that lots of data flows back and forth, and entities (public, private, large, small) can participate in energy consumption or production (via solar / wind / virtual demand response) as peers.

    Managing grid balancing (is energy produced in an area consumed in that area, or must it be shipped at cost via the grid to somewhere / some-when (via storage) else) is becoming a peer-to-peer accountability regulated by a distributed market place.

    Wiser energy retailers are already figuring this out, and as they become service providers rather than commodity suppliers a bloodbath of M&A will soon ensue. 🙂

  6. Brandon Burns

    I think data is the #1 tool we’ll have to combat modern discrimination. If there’s one thing I’ve learned over the years, it’s that everyone thinks they’re the good guy and very few people can pinpoint their own faults. We keep talking about the obvious problems, because it’s hard to deny someone being shot, but I’m a firm believer that it is all the day-to-day, non-obvious transgressions that make up the foundation of the discrimination in our culture.

    Furthermore, we’re a narrative-driven culture, one that is more moved by a powerful anecdote about one individual or event than we are by overarching trends and realities. But data, when presented correctly, can make those realities easier to understand, and can be just as moving.

    And I’m putting my money where my mouth is. Of course, as a black man, I start at a disadvantage; funding is a low-chance reality for me, so I’ve accepted that reality and am exploring other avenues to bring my product to life. But I also have the unique perspective and experience to connect the dots, and my own connections outside of the lily-white world of tech, so there’s that.

    1. Matt Kruza

      So I will do my best to ask this and come across in the appropriate manner. We have interacted a few times, and most recently with your tennis analysis (yep, seems like age slows you down.. too bad Federer lost!). I am a middle class white man, have had some difficult things happen in my life that many haven’t, but certainly can’t come from your perspective, and won’t try to. I do want to give my perception, which may be wrong, and see what you think.

      My take on a lot of the discrimination is that the biggest reason it matters is because of the amplification of poverty and lack of wealth / economic movement. Let me try to elaborate. Obvious, overt, vicious discrimination (the horrors of, say, the start of the nation until the 1960’s in different forms) has vastly decreased. There are still way too many biases in many segments of society, especially by some / many / most whites against black individuals and other minorities. But, from school and professional worlds I have seen that these are less when individuals have had success and have financial means (not talking wealthy, just making say $50k or more for example). It is the more subtle racism that is very deleterious to the family making $25k wondering if they can pay rent, can feed their kids, ever go on vacation etc. Without the financial stress, most of the subtle issues wouldn’t be as impactful; you could simply ignore / get past someone who is clearly in the wrong.

      The upshot of all of this is it is super hard to change bias and personal perception, especially in public policy. But if public policy could, say, close the income gap so in a generation (20 years from now) instead of average black family income being $35k to $60k for white, both are $80k, then I think few problems would exist. One criticism of the approach I see is that part of the problem to fixing wealth / income inequality is the bias / discriminatory views of those in power. There is some validity there, but again I think we should try to focus mainly on closing the wealth gap and call out bias against that, instead of other bias that won’t change the power / income analysis.

      Very long and semi-rambling, but I hope to get your perspective on this and I hope my comment is useful / or insightful in some way.

      1. Brandon Burns

        You are correct in that, as a well educated black man from a middle class home, I experience less overt racism than, say, one of my cousins who didn’t do as well in school and makes less money now. We live in different kinds of environments, and thus live different lives.

        But just because I may fare better than someone less educated doesn’t mean there isn’t discrimination. Wealthy, educated white people also do better than poor, uneducated white people. The better perspective from which to look at it is that, when all is equal socioeconomically, the black person is still going to be worse off than the white person. On average, a white Harvard graduate will do better than a black Harvard graduate. It doesn’t matter if the black guy still has a good life, it’s still one that is unequal to his peers.

        In the genteel sphere of higher education and corporate America, folks don’t go around shooting others, but the discrimination runs just as deep.

        There was a great quote from Chris Rock’s Oscar awards monologue. Something along the lines of, “Hollywood isn’t burning cross racist, it’s sorority girl racist. It’s, ‘Oh, no, we think you’re great, Rhonda! You’re just not one of us!’”

        And really, that’s just human nature for you. The point is, instead of ignoring your natural bias, to catch it and adjust.

        I spent a year working in China, in the Beijing office of an American based multi-national. The staff was half Chinese and half foreign, with the foreigners all fluent in English, and the Chinese having just okay English on average. So, naturally, I spent more time with my non-Chinese colleagues. Not only was it easier to physically talk to them, I related to them more. So when it came time to kick off a new project, in the beginning, I’d usually opt to work with the people I knew better. But the unintended side effect was that I had basically excluded everyone who was Chinese from that inner circle. Which is easy to do when you’re in a position of power, and the marginalized people haven’t made you aware of the asshole you’re being.

        And I had my “reasons,” too, mainly that the Chinese copywriters and designers on my team weren’t as good, and the truth is that they weren’t. But I was a Creative Director, and that was my team, and being a black person who had been overlooked in a similar way many times in my own country, I had to catch myself and rectify it. Also, my boss, an amazing Austrian guy who’s probably the most globally aware person I’ve ever met, was relentless when it came to making sure that everyone worked together well.

        So, I started to take time out to coach team members on skills that I felt were lacking. Did more team building things. I hosted a weekly party at my apartment, and invited the whole staff. And you know what? Eventually, both interpersonal relations and the quality of the work got better, work that was covered in international press and honored at the highest levels in the advertising industry.

        The point is this: we all have natural biases, and that’s okay, in the sense that it’s human nature and that’s never going to go away. But we have to recognize when those biases leave a whole race or gender left out. And then you have to do something about it. Because the end result is better for everyone.

        1. Matt Kruza

          Appreciate the response. I think we are in decent agreement. You stated it better than me, but it seemed like once you were on equal work footing with the Chinese counterparts (working at the same company in similar roles as the proxy for the same economic level) you cultivated deeper relationships and much of the bias, if not as much as possible, went away and then everyone was in the same group. Definitely think this will occur much more once the underlying economic circumstances are more level.

        2. Chimpwithcans

          Brandon, that’s a great story. For what it’s worth, your ‘BBBEE is net positive legislation’ comment from yesterday has really stuck with me like a thorn in my head all day. And I agree with it totally. I am impressed by your eloquence on this matter. I will think more before I open my mouth. I genuinely want to show you Kenya (where I am from) and South Africa (where I live) because I think it would surprise you in many ways, both positive and negative. For now I guess these valuable online interactions will have to do.

          1. Brandon Burns

            Kenya is on my shortlist!

      2. JLM

        You are at the wrong end of the race track. Focus on the beginning, not the finish line.

        We need a society with equal opportunity, not equal outcomes.

        We are promised life, liberty, and the pursuit of happiness. Pursuit, being the key word.

        We are not promised equal outcomes nor should we be.

        JLM
        www.themusingsofthebigredca…

        1. Brandon Burns

          “We need a society with equal opportunity, not equal outcomes.”

          Perfectly said. I might have to steal that!

        2. Matt Kruza

          You have to be focused on BOTH sides of the race track. Obviously you want equal opportunity, but that will never be 100% reached due to past differences / inequities, so it’s important to work / look at the outcomes as well; otherwise focusing purely on the opportunity becomes a farce to not really dealing with how there really aren’t equal starting grounds. It’s a multi-faceted issue, which I am sure you know, but looking at both over a long time frame is appropriate.

          1. JLM

            I was alluding to the start v the finish.

            We can control the starting gate but we cannot control either the progress of the race or the nature of the finish.

            We are not responsible for how a person runs their own race. That is up to them.

            JLM
            www.themusingsofthebigredca…

    2. Matt Zagaja

      When I was at OpenVis someone showed off the following project they made: which I thought was great. I am a bit skeptical of the hidden racism hypothesis, however. When we look at things like Airbnb data and the fact that people of certain races are less likely to have their offers to stay at a person’s place accepted, I don’t believe the person who rejects them is unaware or confused about the fact they are rejecting someone and why.

    3. sigmaalgebra

      You are addressing some questions serious for the US. Getting some solid answers, e.g., finding causes to guide interventions to get the desired changes, for most of the questions stands to be darned difficult.

      The best research university sociology departments are not interested in socializing, ice cream socials, socialization, social problems, social work, social change, or social justice but in solid science for understanding groups of people. It’s tough to do, but for power and effectiveness, solid science usually totally knocks the socks off everything else.

      As usual in science, such sociology tries to be mathematical, and that usually works out to be quite statistical. And such sociology is also, say, like biology, heavily about observation and experiments, controlled or otherwise.

      As soon as someone says cause about a problem in social science, it is time for some skepticism, that is, a grain of salt, that is, time to shovel in a big bucket of salt.

      But, the usual successful approach to finding causes is to have a lot of intuitive understanding, guess the cause, and then scientifically test the heck out of it. So, good intuitive understanding is not the end but it is usually crucial for a good beginning.

      Sure, as you mentioned, for some simple descriptive statistics on some good data, can make some progress in public attitudes and maybe more. But even there being solid is not so easy.

      E.g., one might respond:

      “Of course it seems that too often we have white police officers shooting or otherwise killing black men.”

      Then this observation is supposed to suggest that the cause is attitudes of racism by white police officers toward black men.

      I tend to agree with that, but one might respond:

      “Of course, but, just look at the population in the jails: highly disproportionately young black men. So, bluntly, too many young black men are criminals. So, no wonder too many get shot by police. And by white police? Most of the police are white. So, it’s not racism but just that too many young black men are criminals.”

      “Since that’s the cause, what is the solution? Sure, lock them up.”

      But as we see, that doesn’t work very well either.

      So, net, in the public policy, we don’t really know what the heck to do. One reason is, we don’t have relevant, solid social science.

      If you would like to pursue these questions seriously, especially with solid science, you might find that some high end sociology departments would like to have you as a grad student.

      Warning 1: In social science, doing solid work is not easy.

      Warning 2: Even if you found some really good and relevant science, getting that implemented in public policy might be darned difficult. It might be easier if social science had as much respect as math and the physical sciences.

  7. Ana Milicevic

    Great presentation and I like the framework @nickgrossman

    A significant challenge to moving regulatory frameworks forward may prove to be relatively low data literacy among our current stock of elected officials. This extends to technology altogether, and is especially manifested around newer forms of tech in which I’d lump everything from drones through surveillance tech. We also don’t have a modern legal framework, as the Apple vs FBI case demonstrated, where the basis of the government’s case lay in a precedent dating back to 1911 (and earlier). The talent and expertise needed resides in the private sector; perhaps the most immediate question is how can various governments ensure they’re tapping into the right expertise to help craft policy and regulation?

    I consider the ability to interpret data and perform basic analysis a critical skill for citizens and it’s one we largely remain underprepared for as a society.

    1. Matt Zagaja

      Fortunately the solution is simple: hire the talent. It just turns out that hiring people costs money and government raises money with taxes and people do not like taxes.

      1. Ana Milicevic

        I’m not sure government hiring is the right way to tackle this – most cross-industry initiatives are handled by industry bodies where members of different companies participate usually pro bono (meaning their participation is covered by their usual private sector paycheck). That kind of model could work well here too.

        1. Matt Zagaja

          While I think it’s important for stakeholders from industry to be involved, there also is an issue of loyalty and interest conflicts. Some of the people regulating need to be independent if the regulations are going to have the confidence of the public.

  8. SubstrateUndertow

    If data is the new currency in the world of commercial food fight success, doesn’t data/mining become the new IP that drives everyone’s profits and thus universally prioritizes proprietary control as anchored in the bedrock of volitional self-interest?

    That seems like a very organically distributed barrier against usefully shareable regulatory data-feedback structures!

    I’m just talking through my hat here as I have no expertise in any of the pertinent disciplines at play, only my personal fascination with biological living-systems analogues.

    It does occur to me that applying some sort of neural-net driven adaptive Quorum Sensing regulatory structures across all social/commercial networks by providing for shared anonymized data-input as an opt-in process driven/controlled by end users across all their Apps might hold some promise?

    The only regulation required is the right for all end users to capture and freely/optionally reshare any/all of their anonymized personal App-data as they see fit.

  9. iggyfanlo

    Nick

    We’ve just soft launched a consumer data union… in essence an agent/broker for consumers and their data and those mining it and profiting from it.. I’d love your feedback… and the community’s

  10. creative group

    William Mougayar:

    How do you think the future of miners will continue until 2041? Less incentive to mine on each split.

    --------------

    News that prompted this question:

    Bitcoin, the digital money, is set to undergo its second-ever “halving” event today.

    Hard-coded into cryptocurrency’s rulebook is a law that cuts in half the value of the digital payout that so-called miners receive for supporting the network with their computing power. The event happens about every four years. (More specifically, after every 210,000 blocks of transactions are processed.)

    (Source: Fortune’s Data Sheet)
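The halving rule quoted above can be sketched in a few lines. The 210,000-block interval comes from the quoted text. The 50 BTC starting reward and the 64-halving cutoff are not in the comment; they are Bitcoin protocol parameters supplied here. The function name `block_subsidy` is illustrative only. Amounts are in satoshis (1 BTC = 100,000,000 satoshis) so that the halving is exact integer arithmetic.

```python
HALVING_INTERVAL = 210_000          # blocks between halvings (from the quoted text)
SATOSHIS_PER_BTC = 100_000_000
INITIAL_SUBSIDY = 50 * SATOSHIS_PER_BTC  # starting reward; not stated in the comment


def block_subsidy(height):
    """Newly created coins (in satoshis) paid to the miner of the block at `height`."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        # After 64 halvings the reward is defined to be zero.
        return 0
    # Each halving shifts the integer reward right by one bit, i.e. divides by two
    # and discards any fraction of a satoshi.
    return INITIAL_SUBSIDY >> halvings


# Reward at the start of each of the first few eras, in BTC.
for era in range(4):
    height = era * HALVING_INTERVAL
    print(height, block_subsidy(height) / SATOSHIS_PER_BTC)
```

This covers only the block subsidy. Miners also collect transaction fees, which this sketch does not model, and fees are the part of the payout that does not shrink on a schedule.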

  11. thinkdisruptive

    Assuming you could create an efficient global database that collected relevant data (if it isn’t relevant, it’s next to useless) about a problem that I wanted to analyze and regulate, there are still two major issues which you don’t address.

    1) If the data source can’t be traced and audited, then the data cannot be considered reliable/trustable. This is particularly important for regulation. We barely trust bureaucrats and elected officials today; why would we trust them more in a world where source data is encrypted and I can’t validate and verify the basis for any decisions or regulations that result from it? A real world example: red light cameras. City officials, both through deliberate actions and calibration carelessness, set cameras to shoot pictures of license plates before cars even entered the intersection, resulting in huge numbers of people falsely cited with moving traffic violations (and many cities being forced by their citizenry to de-install expensive cameras as a result). Any data where there is an incentive to the regulator to manipulate the result is guaranteed to have similar outcomes, and there will inevitably be unauditable innocent mistakes in other outcomes. Encrypting source data used for regulation, which by its very action diminishes transparency, is a really bad idea.

    2) There is an implicit assumption that data is a public resource, i.e., it has no owner. This is a fundamental privacy issue already, as companies collect and aggregate tons of personal data about individuals, and license and sell that data to others without the consent or even knowledge of those affected. That data properly belongs to the individual, who should have the right to decide how and whether to share it or not. Despite the prevalence of social media, many people still choose not to participate precisely because government has not done enough to protect privacy and defend the rights of people to control how data they choose to share is used. Also, by not protecting privacy and data ownership rights, government is creating the opportunity for massive security breaches that will make stories like the hacked databases of Target and Home Depot of the last couple of years look like very small potatoes.

    I would suggest that the biggest obstacle to data-based regulation is not technological, but far more basic issues of trust and ownership. Until those are resolved, or built into a technological solution, this is just a pipe dream.