Apr 6, 2019

Video Of The Week: AI and Society

Earlier this week, Kara Swisher interviewed Kate Crawford and Meredith Whittaker, who run NYU’s AI Now Institute.

It is an interesting and thought-provoking discussion. I don’t personally love Kate and Meredith’s answers on how society should be thinking about these issues. They feel very “20th century” to me.

But regardless of what you think about their particular take on the issues, I do think we all ought to be paying a lot of attention to AI and its impact on and role in our society. It is important.

#machine learning #policy

Comments (Archived):

kidmercury Apr 6, 2019

I think AI is poorly defined, and that is the source of most of the problems. Companies like Tesla that get in trouble with AI do so primarily because they are trying to get more out of the technology than is safely possible (or are overpromising what can be done). To me it is clear that they should assume this liability, and should not be able to use “the computer did it” as a legal excuse. If this rationale were accepted, it would resolve many of the legal and moral disagreements that are surfacing around AI.

AI is mostly statistics on steroids. Stats are not a problem, and no one would ever say “it is not my fault, it is math’s fault” in response to a statistical error.
1. Richard Apr 6, 2019
  
  Pretty close: there are infinitely many (mostly unknown) functions that can fit the data. If the data set itself is infinite in size, you can never know with certainty that you selected the best function.
  1. kidmercury Apr 6, 2019
    
    Yes, but then you shouldn’t over promise in light of what is unknown. And if you do, that error is on you.
  2. sigmaalgebra Apr 6, 2019
    
    Suppose that for some positive integer n we have, for i = 1, 2, …, n, pairs of real numbers (x(i), y(i)), graph them in the usual way, and want a curve to fit them. Consider, say, let me think a little …:

    Okay, assume that the x(i) are distinct. For some easier notation, for i = 1, 2, …, n, define the polynomial

    r_i(x) = (x - x(1)) (x - x(2)) … (x - x(n))

    omitting the factor

    (x - x(i))

    and define the real number

    s_i = r_i(x(i))

    [Here the “_i” is TeX notation for a subscript of i.]

    So we notice that r_i(x) is 0 at x = x(j), j = 1, 2, …, n, except r_i(x) = s_i at x = x(i). That is, more intuitively, r_i(x) really is a polynomial and is 0 at all the x points except x(i), and there it has value s_i.

    Also, since the x(i) are distinct, s_i is not zero, which means we can divide by it.

    So, polynomial r_i(x) is cute: It’s zero, makes no contribution, at all the X axis data points except its particular one, x(i).

    Then define the polynomial

    q_i(x) = y(i) r_i(x) / s_i

    Presto, bingo, this little puppy polynomial is zero at all the X points except that at x(i) it is just what we want, our given y(i). We’re getting warm!!!

    Finally let the polynomial

    p(x) = q_1(x) + q_2(x) + … + q_n(x)

    That is, we just add up the little puppy polynomials, one for each of our given points. Each little q_i(x) puppy polynomial does its job, gives value y(i) at x = x(i) and at the other points does nothing, gets out of the way, is zero.

    Done.

    I figured this out soon after I first heard about the interest in curve fitting. Then I programmed it and discovered that mostly it is a total riot: commonly between the given X axis points it heads off for plus or minus infinity, turns around just in time, DOES go through the next point, and then goes wack-o again.

    So, it’s NOT very useful! As interpolation it’s sick-o!

    Later I discovered that long ago Lagrange had had the same idea, and the polynomial is known as Lagrange interpolation. Lagrange DID have a lot of better ideas!

    A better idea is spline interpolation.
And it’s also easy enough to do least squares spline interpolation. And can do multivariate spline interpolation, useful, e.g., in the optimal value function of stochastic dynamic programming and maybe in computer aided design, e.g., getting the sheet metal over the rear wheels looking like the bottom of a woman!
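The construction in the comment above can be sketched in a few lines of Python. This is only an illustration: the function name `lagrange_interpolate`, the choice of 11 equally spaced points on [-1, 1], and the test function 1/(1 + 25x^2) (Runge's classic example) are assumptions made here, not details from the comment. The sketch checks that p(x) hits every given point and then measures how far it strays from the underlying function halfway between the first two points.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate p(x) = q_1(x) + ... + q_n(x), with q_i(x) = y(i) r_i(x) / s_i.

    Assumes the values in xs are distinct, so every s_i is nonzero.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        r = 1.0  # r_i(x): product of (x - x(j)) over all j != i
        s = 1.0  # s_i = r_i(x(i))
        for j, xj in enumerate(xs):
            if j != i:
                r *= x - xj
                s *= xi - xj
        total += yi * r / s
    return total


def runge(x):
    """Illustrative smooth function to sample (an assumption of this sketch)."""
    return 1.0 / (1.0 + 25.0 * x * x)


n = 11
xs = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
ys = [runge(x) for x in xs]

# The polynomial passes through every given point ...
max_node_error = max(
    abs(lagrange_interpolate(xs, ys, x) - y) for x, y in zip(xs, ys)
)

# ... but between the first two points it swings far away from the function.
midpoint = (xs[0] + xs[1]) / 2
midpoint_error = abs(lagrange_interpolate(xs, ys, midpoint) - runge(midpoint))

print("largest error at the given points:", max_node_error)
print("error halfway between the first two points:", midpoint_error)
```

The sampled function never exceeds 1 in absolute value, so a midpoint error of comparable size is the "heads off, turns around just in time" behavior the comment describes.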
2. sigmaalgebra Apr 6, 2019
  
  And nearly all the statistics is shaky. And the statistics is based on probability, and even with the best mathematical foundations of that there is at least one place where have to swallow hard.
  1. Pete Griffiths Apr 7, 2019
    
    This is a little vague.
    1. sigmaalgebra Apr 7, 2019
      
      Newton did calculus. By 1900 or so, the work of B. Riemann and others, e.g., E. Borel, had calculus as careful theorems and proofs.

      But what Newton/Riemann did ranges from awkward to inadequate for a lot that was wanted from calculus, especially in theorems, e.g., Fourier theory. So, near 1900, H. Lebesgue, student of E. Borel, redid calculus, especially the integral, and got what we now call measure theory.

      So, for sets A, B and function f: A -> B, Riemann wanted both A and B to be the set of real numbers R. Lebesgue also wants B to be R but made A much more general, just a measure space, essentially any set with a reasonable definition of area. Right, R is a special case.

      Right off, we can take nearly all the books in probability, statistics, physics, and engineering and toss them because they integrate f: R -> R, that is, integrate from minus infinity to infinity. Even the work of Riemann, Borel, etc. didn’t explain how the heck to do that. And in some tricky situations, you can’t do that.

      Can see much of the problem with just infinite series: Pick a series that has both positive and negative terms. So, maybe in some sense the series converges. I won’t look it up now, but consider the series with terms a_n = (1/n)(-1)^n. As we know, the series with terms just 1/n diverges, that is, as we add up the terms the sum grows as large as we please, intuitively grows to infinity. Well, I won’t derive it now, but this little a_n puppy, due to the way the -1 alternates in sign as n varies, does converge, but only conditionally, since the series of absolute values is the divergent 1/n series. Then there is a cute result, intuitive enough, that if we rearrange the terms of a_n we can get the rearrangements to converge to anything we want. In the rearrangements each term does get added in eventually, but the order the terms get added in changes. Bummer, because the sum depends on the order in which we add the terms.

      Essentially the same for integrating functions from minus infinity to infinity: we don’t know what we’ve got. In more detail, we have to integrate over some intervals of finite length, let those lengths go to infinity, and see what limit we get. But depending on how we let the lengths go to infinity, we can get anything. Bummer.

      Well, in practice those books do get away with it, but the books still don’t say when integrating from minus infinity to infinity works and when it doesn’t. Well, as just a small part of his work, Lebesgue cleaned that up: In short, take the function f and define two more functions, f^+ and f^-. The function f^+ is the positive part of f, that is, is the same as f when f is >= 0 and otherwise 0, that is, when f is negative, f^+ is just zero. Now just ask in some relatively simple ways, even much like Riemann, what is, intuitively, the area under f^+. Then we look at the negative part of f, f^-, which is -f when f is negative and otherwise 0, and do the same. If both parts have finite integrals, then we are on solid ground and can define the integral of f, which will be finite. If just one of the two is finite, then still we can be not so badly off. If both integrals are infinite, then we punt and say the function has no such integral. The Riemann work didn’t do this, and that’s a bummer.

      Now essentially always in those books, if we just reinterpret the integrals as Lebesgue integrals, then we get back on solid ground.

      So, in integrating f: A -> R, Lebesgue freed up A to be just a measure space, e.g., if you wish, all of R. Again Newton, Riemann didn’t do this.

      Then in 1933 A. Kolmogorov saw that the areas, measures, of the Lebesgue theory could be used as probabilities, could provide a mathematical foundation for probability theory. So, a probability was a flavor of area.

      So, we go into a lab, or just outside, or even just stay inside, look at something, and get a number. That number is the value of a random variable from one trial. That random variable is also a function [with domain, the set A above, the set of all experimental trials, of the kind that Lebesgue can integrate], and the Lebesgue integral of that random variable is its expectation. If we just ask what is the probability the value we observe is, say, greater than 10, that, too, is one of Lebesgue’s integrals.

      So, net, via Kolmogorov, the things we need in probability theory we get from Lebesgue’s theory.

      In particular, for that trial we did, surprise, that’s the only one we ever do! With this theory, all we ever see in the whole universe and all our time in it is just some one trial, some one point in the set that Lebesgue integrates over. So, we have to imagine all those other trials. Swallow hard here.

      Where the elementary approaches to probability talk about trials and n samples, the modern theory has n different random variables. Then for “simple random samples”, the modern approach uses a good definition of independence and says that the random variables are independent. It goes on this way: the modern approach and the old elementary approach each have their definitions for essentially all the same parts and pieces.

      But a shocking part of the modern approach, have to swallow hard here, is that all we ever see is just some one trial. And, yup, all the definitions, probability of the observation being greater than 10, expectation, independence, correlation, distribution, etc. depend on trials we will never do or see. Swallow hard here.

      As for statistics, it uses independence like water, with rarely much justification. Shaky stuff.

      For a little more: Consider f: R -> R so that f(x) is 1 when x is rational and zero otherwise. Well, the Riemann integral of f on the interval [0,1] does not exist, but the Lebesgue integral does exist and is zero.

      Having a good answer to the integral of f cleans up some loose ends in some cases of convergence of functions and their integrals.

      A bounded function f: R -> R has a Riemann integral on a closed interval of finite length if and only if the function is continuous on that interval except on a set of Lebesgue measure zero.

      For convergence in vector spaces of functions, asking for so much continuity is asking for a lot, for too much. So, we really like the Lebesgue integral.

      Oh, by the way, for the functions Newton was integrating, both Riemann and Lebesgue get the same numerical result.

      In a nutshell, Riemann partitioned on the domain of a function and Lebesgue partitioned on the range. Turns out, Lebesgue’s approach is better.
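The rearrangement point in the comment above is easy to check numerically. The following Python sketch is illustrative only: the function name `rearranged_partial_sum`, the greedy ordering rule, and the targets 1.0 and -2.0 are assumptions made here. It adds the terms (1/n)(-1)^n in their natural order, then reuses exactly the same terms in a different order chosen to steer the running sum toward a target.

```python
import math


def rearranged_partial_sum(target, num_terms):
    """Sum num_terms terms of (1/n)(-1)^n in a greedy rearranged order.

    The positive terms are 1/2, 1/4, 1/6, ... (even n) and the negative
    terms are -1, -1/3, -1/5, ... (odd n). Each term is used at most once.
    While the running sum is at or below the target, take the next unused
    positive term; otherwise take the next unused negative term.
    """
    next_even, next_odd = 2, 1
    total = 0.0
    for _ in range(num_terms):
        if total <= target:
            total += 1.0 / next_even
            next_even += 2
        else:
            total -= 1.0 / next_odd
            next_odd += 2
    return total


# Natural order n = 1, 2, 3, ...: the partial sums approach -ln(2).
natural_order = sum((-1) ** n / n for n in range(1, 200001))

print("natural order:", natural_order, "vs -ln(2) =", -math.log(2))
print("rearranged toward  1.0:", rearranged_partial_sum(1.0, 200000))
print("rearranged toward -2.0:", rearranged_partial_sum(-2.0, 200000))
```

Because both the positive terms and the negative terms separately sum to infinity, the greedy rule never runs out of terms to push the running sum back across the target, and because the terms shrink to zero the overshoot shrinks too.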
      1. Pete Griffiths Apr 7, 2019
        
        Hmm. Well, I did ask. Now I have to think.
3. JamesHRH Apr 7, 2019
  
  Fuckin’ Eh Kid. It’s like Y2K.
Guy Lepage Apr 6, 2019

The folks at IIW have been working on this (“this” being reputation) for about a decade. They host a bi-annual event at Google’s Computer History Museum. I’m shocked that they have not attended the event with all of the other folks from Google, IBM, Microsoft, Mozilla, etc.

Reputation is damn hard, and it scares me the most about AI. We need more folks focusing on building reputation systems. I personally see that as the big threat with AI.
sigmaalgebra Apr 6, 2019

Looking at the picture, my first reaction HAD to be:Was is besser als ein Mädchen? Drei Mädchen!?What is better than one woman? Three women? Right, Albert?Yesterday for my startup got my Dutchess County, NY paper on “Doing Business As” (DBA — which at first I responded with “that means data base administrator?”) and my IRS EIN (Employer Identification Number) and stepped up in the world and got a Chase account! At one point in the paperwork there were three women looking at some computer screen considering some comma or some such!! So, to tease the women I blurted outWas is besser als ein Mädchen? Drei Mädchen!?And I added “Not only are they pretty, they are smart!”It used to be back to college the girls/women commonly gave a nice smile at being teased and told they were pretty and smart. I thought that they liked the attention, but maybe not, maybe they were just acting? Ah, I don’t want to be as slow as Biden to catch on!!!But, gee, it would seem so boring, to the point of an insult, to react to them no more than to the furniture.Okay AVC women, what is it?Here is a MUCH better case ofDrei Mädchen!:the end of the Richard Strauss Der Rosenkavalier as athttps://www.youtube.com/wat…A common claim is that Richard Strauss was really good at writing for the female voice — NO JOKE!!!!!!!! REALLY good!!! Definitely WAY up in the UNbelievable, how’d he do that, how could there ever be anything like that, in the drop dead gorgeous category.Now, women, you definitely do NOT want to be treated like the men!!! 
For that you will have to know about rear axle gear ratios, transmission torque converter lockup, supercharger inter-cooling techniques, the clips used to keep side windows in their tracks, throttle body fuel injection, sequential port fuel injection, and direct fuel injection, what happened to Lebron’s career, the heap data structure and the Gleason bound, network address translation, SMTP, SNMP, certification authorities, attribute control lists, capability inheritance, content delivery networks, how a guy in a bathrobe, Commander Rochefort, was much of the key to US’s winning the war in the Pacific in WWII, the differences between Fat Man and Little Boy, how electromagnetic pulse works, and MUCH more.On AI, as the AVC audience knows, I’ve been there, done that, at IBM’s Watson lab, led our work with GM Research, delivered our paper at the AAAI IAAI conference in Stanford, published various papers in AI, and for the main problem we were solving with AI did some original applied math based on some advanced pure math prerequisites (not related to the math in my startup) that totally blew out of the water and blew the doors off the AI work and published that in Information Sciences as sole author.My view of current AI is that mostly it is just some applications of old curve fitting, especially regression analysis going way back, but done MUCH less well than in, say,C. 
Radhakrishna Rao, Linear Statistical Inference and Its Applications: Second Edition, ISBN 0-471-70823-2, John Wiley and Sons, New York.and otherwise is some novel uses of huge amounts of data that in practice are rare.Moreover, the work on the huge amounts of data is really short on anything like a solid foundation, is heavily heuristic, lacks important statistical properties, and generally is tough to reproduce.The work I did that blew the doors off AI before would do that again now, remains by a wide margin the best work for behavioral monitoring of server farms and networks.The real situation remains: We get data, read it in, manipulate it, and report results. The manipulations are necessarily mathematically something. For more powerful manipulations for more valuable results, proceed mathematically, at times with original work in applied math based on some appropriate pure math prerequisites. The profs, students, and workers in the computer science community are very short on the math and are floundering around.The current AI situation has a lot of hype. For the work it appears that there is good and new, however nearly all the good is not very new and nearly all the new is not very good.The media world needs exciting stories to tell, and people with hype want their story told. But the media hype stories usually don’t last very long.In particular more than once in the past the AI hype flopped and resulted in an “AI winter”. E.g., history shows that in the early days of vacuum tube computers, some of the hype was about “The gigantic IBM electronic human brains.”. Uh, IMHO computer science has still made zip, zilch, and zero progress toward “human brains” or puppy dog, kitty cat, Peregrine Falcon, fox, wolf, bear, crow, horse, penguin, seal, dolphin, orca, or whale brains.Want to do some applied math curve fitting with, say, some significant properties? Okay. Select and work carefully and maybe can make some progress. 
But there’s much more that can be done, in optimization, stochastic processes, optimal control, dynamical systems, applied probability, …. Generally need to be close to or working with well done definitions, assumptions, theorems, and proofs — if are missing some of those, then are on thin ice in mid spring.
sigmaalgebra Apr 6, 2019

Okay, I read the bios and the 10 recommendations:As we know, men concentrate on things and women, on people. Well, those two women are concentrating on people, society, and various political issues. For the “things”, the actual technology, they apparently have no reasonable definition of AI and have decided to call nearly any application of computing past an old scientific calculator AI. So, they have expanded AI to cover anything with bits, bytes, software, or the Internet “AI”. Hmm, that’s awfully broad.Also, it is clear enough (1) they want POWER, especially rules, restrictions, advisory and oversight bodies, regulatory bodies, BoD slots, open source of proprietary intellectual property, onerous auditing, legal; (2) they are radical feminists; (3) they are into the radical stuff about diversity about not just gender but also “sex”.Long ago I saw a lot of really destructive wack-o. It appeared clearly that the wack-o I was seeing was from human females. An expert explained to me that the wack-o commonly starts at about age 22. Then I saw that usually a woman with several of her own young children who was a stay at home wife with a good husband did quite well concentrating on her children and avoided any very serious active wack-o. She was really involved with, dedicated to, really busy with being good as a MOTHER to her children. She was astoundingly attentive, perceptive, solicitous, affectionate, caring, protective, supportive, etc.The situation was as if the old “A woman’s place is in the home, barefoot, pregnant, and dependent.” had some fundamental support and truth.And it appeared that if she was not a mother, then starting at about age 22 year by year she would fall more and more into destructive wack-o as if it was to incapacitate her for practical things and render her desperately dependent on a strong man, all with, net, reproductive advantage, that is, make Darwin happy.A big theme in all of this wack-o is anxiety: The women are afraid of things. 
There are credible claims that even across societies, races, and continents, women are four times more likely than men to have anxiety problems. One of the standard consequences of high anxiety is obsessive-compulsive behavior. It appears that the extreme focus, dedication, devotion of such behavior has, net, for mothers had reproductive advantage.So, in particular, the two women here appear to be all wound up, as Dad used to say, “as tight as the spring in a dollar alarm clock”, seeing threats from computers and looking for rules, etc. for security against those threats. Net, they are going wack-o.It has appeared that such wack-o was Darwin’s idea and long had reproductive advantage, and that’s my best and only explanation that fits the facts of what ruined my marriage, kept me from having children, and killed my wife. Sorry, I have to take this explanation seriously.My wife was intellectually brilliant: Valedictorian, Phi Beta Kappa, Summa Cum Laude, Woodrow Wilson and NSF Fellow, mathematical sociology Ph.D. with professors the two best in the world, Rossi and Coleman, and with nearly no computer background learned our work in AI after two lectures from me, 15 minutes each. She was brilliant. But her anxieties overwhelmed her rationality. She emotionalized instead of rationalized and went for wack-o. It was fatal.Yup, my view of the two women there at NYU is that mainly all they are doing is showing that they are not all wrapped up in having children and, instead, are going for wack-o that would usually render them dependent and have reproductive advantage. Sorry ’bout that.At one time I strongly believed that “women don’t have just to be cared for; women can do things, too” — biggest mistake of my life, short of actually dying, about the biggest mistake a man could make.My advice to these two women on AI is — f’get about it.
1. Pete Griffiths Apr 7, 2019
  
  This is pretty crazy.
  1. sigmaalgebra Apr 7, 2019
    
    I needed something to fit the horrible data; the above took me 20+ years to figure out and start to believe. Major parts of it are on some of the more solid ground of the mental health community.Commonly girls grow up cherished, treasured, protected, loved, cared for. They respond with smiles, being cute, sweet, pretty, darling, adorable, precious, perfect. In everything in K-6, they are MUCH better than the boys. In nearly everything in 7-12 they remain better. In the non-STEM fields in college, they are still better than the boys. In working with people, when they want to, they remain better than the boys/men for life.So, no one wants to believe, say, suggest, suspect that at about age 22 either (A) they are all wrapped up on the mommy track or (B) they start to go wack-o in ways that incapacitate them for too much of the real world and render them dependent on a man, with, net, reproductive advantage. And since the girls/women are so good at acting and filling roles to please others, they can cover up a lot of both (A) and (B): For (A) they can have children and then look productive in the real world, even in the STEM fields, and for (B) they can come up with excuses for why they don’t have children and pursue work so much they look even good at it. Since there are a lot of norms and expectations, there can be a lot of acting going on. So, it’s super tough to believe the importance of (A) and (B). And the cases I saw can be discarded as outliers, but I concluded that the outliers where (A) and (B) were easier to believe also described similar but weaker situations in much of the rest of the pack.To show this stuff scientifically would be a heck of a lot of work trying to do good science in social science. But too often in life, if we wait for solid science, we wait too long in life.In the case of my wife, lots of people tried to help, but no one did, and I was the only one who came up with an explanation that fit the facts. 
And a lot that I used is from the more solid material in some of the relevant fields. And, if look at her mother and sisters, the cause of the problems was NOT me.I’m moving on with my startup. But, then, things pop up, e.g.. Fred’s blog here with these three women getting all wound up about the social, political, ethical, legal, etc. aspects of AI when AI is not even well defined. Or, (i) for social media, it’s easy — don’t use it. I don’t use Facebook hardly at all. Never used SNAP or MySpace. For (ii) politics, we have big elections, hard fought, each two years. If anyone has more good content about the elections, then post it on the Internet. In trying to understand politics, I like Gingrich. In trying to know what is going on in politics in DC, I pay attention to Gingrich but mostly just pay attention directly to the politicians themselves, especially Trump, especially on Twitter and YouTube, but, sure nearly all the other powerful ones, Graham, Pelosi, Schumer, …. For (iii), the US and Western Europe agree fairly strongly on some quite strong ethical principles; if something about AI violates those, then people will push back like they do with other ethical issues. For (iv) legal, we have DC working everyday on legal stuff, passing laws nearly daily, courts challenging them, etc.For the issue of Internet privacy, the EU came up with their General Data Protection Regulation (GDPR); on first glance, it looks good to me, and it seems that my startup will be okay with that right away. E.g., at my Web site, users don’t have user IDs or passwords and don’t log in, and I do nothing with cookies and next to nothing with JavaScript. I use no past data on users: If two users enter the same data on the same day (i.e., I have not changed the software between their uses), then they will receive the same results. 
On what the ad networks might do I don’t have details yet; maybe the ad networks get the IP address of the user, and maybe even the “agent string” their browser sends, but I haven’t looked into that yet.Look, ladies, the sky is not falling yet.
Pete Griffiths Apr 7, 2019

“I don’t personally love Kate and Meredith’s answers on how society should be thinking about these issues. They feel very ‘20th century’ to me.”

Could you be more specific?
Matt Zagaja Apr 7, 2019

In Cambridge, MA I sit on a body called our “open data review board”. What this means is that once a quarter I participate and help set policy/direction for the city around data. I do this because, besides being a lawyer, I lead a community group called Code for Boston where lots of folks use data. When I worked in politics we used machine learning AI to target voters, and I had to train and help our community members understand what that meant and how it worked, or otherwise they would just ignore what the AI suggested. There is a big gap between how AI works and what companies know about you, and what people understand about this subject area.

First, people do not understand the amount of data they are sharing, and when they come to understand it they are not comfortable with it. The biggest backlash we faced when I was in politics was when folks received postcards letting them know their neighbors’ voting histories. This is all public data and has been for years, but people found its availability and disclosure invasive. Part of it might be that they are overvaluing privacy. But as a society, isn’t that a choice we should make together? The rules we have should reflect our society’s values.

Second, if the tech industry does not want to suffer the backlash it has seen with things like GDPR and Facebook, then it needs to be more proactive in building its platforms and policies with its users. As a web developer I know that data and AI can help us better understand how users are using our products and what they need or want. User research can give us insight into how they feel about it. I believe that if you want to be a long-lasting, successful tech company you need to put users first and genuinely engage them.
André Rocha Apr 8, 2019

Thanks for this