# Video Of The Week: John von Neumann on K12 CS Education

As many of you know, I have spent a fair bit of my time over the last ten years on increasing the amount of CS Education in our K12 system in NYC and around the US.

My friend Rob sent me this short (2 1/2 min) clip of John von Neumann in the early 50s talking about how important CS Education and in particular K12 CS Education would be.

We largely ignored his advice for the last sixty years but I am optimistic that we are finally heeding it.

## Comments (Archived):

I liked his answer about what comes before scholarships. He repeated the need to train teachers. Got a kick out of these wired microphones!

When I first got to Stuy my chairman told me about a conversation he had with Marvin Minsky a few years prior. He asked what Minsky thought the high schools should be doing in terms of CS education. Minsky replied “nothing” since HS teachers would just screw things up for the college folk. It took a bit of convincing to bring my chairman around. He always liked the idea that I brought more CS for the CS inclined kids but getting him to agree that it was something for all students took some work.

It is amazing how many chances life provides to do important things; important things give you opportunity after opportunity. Conversely, life’s scams have windows of opportunity that close quickly.

Gee, terrific, von Neumann!

He wasn’t talking about what he had done on Hilbert space, spectral theory, a really novel proof of the Radon-Nikodym theorem, shaped charges, quantum mechanics, etc.

Really nice looking times!!! The women, the WOMEN, the WOMEN, uh, they looked like, uh, RIGHT, WOMEN!!!!! Amazing that back then we had lots of actual WOMEN!!!!!

Yes, even Johnny knew that he couldn’t make much progress just analytically with the Navier-Stokes (fluid flow) equations and needed to fall back to just numerical approaches. Okay; we all have to agree with that. So, he cooked up the von Neumann architecture.

Early in my career, I was trying local analytical, power series, approaches along with global numerical approaches to the Navier-Stokes equations for simple cases, e.g., spherical submarines, maybe a little more realistic than what physicists are tempted to do with spherical cows, but, right, the big swimming pool towing tank was closer to reality.

Yup, one of my favorite books is P. Halmos, Finite Dimensional Vector Spaces (FDVS). Halmos had just gotten his Ph.D. from J. Doob, as in Stochastic Processes and martingale theory, amazing stuff, martingale theory (every stochastic process is the sum of a predictable part and a martingale, and every L^1 bounded martingale converges to some random variable; the easiest proof of the strong law of large numbers; some of the most powerful inequalities in math; can build a career in academics just running around cooking up problems where one can apply martingale theory) at U. IL, and went to the Institute for Advanced Study (IAS) in Princeton and asked Johnny for a job and became an assistant to von Neumann.

FDVS is a finite dimensional vector space introduction to Hilbert space (the finite dimensional vector spaces are also Hilbert spaces; the real line is a Hilbert space; but the greatest interest is in how the finite dimensional situations extend to the infinite dimensional situation, e.g., for the Fourier transform, signal processing, and more) and spectral theory (“in quantum mechanics, the observables are eigenvalues”, that is, from spectral theory) good enough that physics students starting with quantum mechanics have often been advised to set aside a direct study of (possibly infinite dimensional) Hilbert space and just read FDVS. FDVS is darned elegant. Halmos was one of the best writers of math in the 20th century.

That is, physics wants to study quantum mechanics; they want each of their particle wave functions to be a point in a Hilbert space; so, for Hilbert space go to von Neumann; but first take the training wheels of von Neumann’s assistant Halmos in FDVS.

FDVS did me a lot of good: As an undergrad, I’d had a course in abstract algebra that, sure, touched on vector spaces. Later I took a reading course in physics that did a little more. Later I got the notorious Princeton honors thing, Nickerson, Spencer, and Steenrod (all famous math names) Advanced Calculus, an early attempt, with AWFUL typing, to teach undergraduates the exterior algebra of differential forms (i.e., bookkeeping of the multidimensional version of the common physics and engineering line integral) and did some more, e.g., Gram-Schmidt (how to get a gorgeous orthogonal coordinate system out of just any garbage data and prove one of the crucial results on the way to the crown jewel of the polar decomposition). Later I wrote my honors paper on group representation theory (from E. Wigner’s work on quantum mechanics of spectral lines from molecules) and, right, the means of representation are unitary matrices very close to spectral theory and unitary in quantum mechanics. Later I got the E. Nering book on the subject (Nering was an Artin student at Princeton) and went carefully through nearly all of that. There was some linear programming in the back (curious approaches but highly inferior to later, really nice approaches) and on my honeymoon that got me a job offer at IBM’s Chicago branch office to do hand holding with oil people wanting to see how to crack the crude to get the most total value of the resulting fractions; that problem remains important with better non-linear approaches.

Then I got FDVS and read quite carefully; it was clearly an elegant book. He has an ergodic result in the back, and now that I have been through high end versions of ergodic theory it’d be nice to take two hours and compare with what FDVS has since when I was reading FDVS I didn’t know what ergodic theory was about (pour cream into coffee and stir and keep stirring and eventually the cream and coffee will return as close as you please to where they were when first poured in the cream, due to H. Poincaré and due to measure preserving I used in a paper I did in anomaly detection). Later I got pushed into multivariate statistics, and, sure, the geometrical emphasis in FDVS was really good to have. Then got pushed into numerical aspects of linear algebra, e.g., now in Linpack, some very nice work, and a really curious way to do linear algebra numerically exactly with good efficiency with normal machine arithmetic (all based on some number theory) of M. Newman. Later got pushed into the fast Fourier transform which is also closely connected, also with more oil patch work, Navy submarine work, etc.
Then I sat and wrote a manuscript of something like a draft of a book on the subject.

Later I got Johnny’s Quantum Mechanics and got through about the first half (close to FDVS) and was having a great time when I got pulled in another direction.

Then I went for a Ph.D.: There was an advanced course close to FDVS. I told the faculty I thought I didn’t need it. They smiled and suggested I take it anyway. Okay, to review, at school, the night before the first class, with my draft book still at home, I wrote out 80 pages or so which covered about the first 2/3rds of the course. The course was strictly graded, in detail. Bummer. The grader made a mistake on one of my papers, and I corrected him and he made no more mistakes. About 60% of the way through the course, the grader asked if I knew how I was doing in the course. I thought I was doing okay. The grader showed me the class scores on homework, the tests, and the midterm, and I was beating all the other students by wide margins. Same through the end and the associated qualifying exam. Horn has been a world-class name in that subject, a student of C. Loewner at Stanford, similarly. But the course was a bummer for me: I was usually going with nearly no sleep on Thursday nights doing the long homework. I was so tired doing it that when I got the paper back I recognized my handwriting and knew the material but had no memory of doing the work. That wasn’t education, it was filtering. I got the impression that my knowledge of the subject was resented, that I had messed up the reliability or some such of the filter. Soon I did some research that showed that I deserved to do well in the filter.

When I was reading FDVS, I had gotten to the Hamilton-Cayley result where Halmos has a nice proof but less general than the one in Nering. But Nering was interested in algebra (I’m not), and Halmos was interested in analysis (I am).
So, for the difference I wrote Halmos a letter on the difference and got back a nice answer, saying that I obviously understood his book! I included the letter with my application to grad school. So, when the course got to the polar decomposition and I spoke up in class “That’s my favorite theorem!”, Horn responded “Thank you Dr. Halmos.” Horn was so shaken he didn’t continue with the proof!

Uh, students, teachers: I never took a course from the books by Nering, Halmos, Nickerson, Spencer, and Steenrod, on numerical linear algebra, on multi-variate statistics, on the fast Fourier transform, quantum mechanics, etc. Computing? Did a lot of it; taught a lot of it; essentially never took a course in it. That’s actually the way it SHOULD be: It just AIN’T all in the courses, guys.

The polar decomposition is good math, elegant, powerful. A proof can be surprisingly short, and the number of applications surprisingly long and valuable.

The Radon-Nikodym result was crucial to my dissertation and some other work. W. Rudin gives von Neumann’s proof.

Yes, von Neumann, Halmos, Rudin, some of my favorite authors!

Yes, if back there when women were women Johnny was saying that there was a great future in computing, he was correct. And we’ve charged ahead, with results beyond belief, in hardware, systems software, infrastructure software, applications software, communications hardware and software, data base, algorithms, software development means, etc. We’ve definitely junked the typewriters, typesetters, library card catalogs, a lot of paper, mechanical adding machines, slide rules, electronic calculators, land line telephones, film cameras, audio tape, video tape, and much more.

But, for middle school students programming smart phones?

Uh, Johnny got interested in computers, the von Neumann architecture, for ways to make progress on some important applied math, e.g., the Navier-Stokes equations.
Well, my view is that the value is not learning the iPhone software development tools but the real purpose, the applications, including the ones Johnny had in mind.

Or, computers take in data, manipulate it, and report results. We want the manipulations to be powerful so that the results will be valuable. For more powerful manipulations, often proceed mathematically, e.g., with the polar decomposition which totally blows out of the water with the doors blown off the mostly just intuitive and heuristic approaches of computer science, artificial intelligence, machine learning, data science.

So, I’d advise the middle school students to go light on the latest version of the iPhone software development tools and, instead, concentrate on (i) the real problems where we need valuable solutions and (ii) the powerful means of data manipulations, most centrally pure/applied math, for the valuable solutions. Here (ii) is challenging. But (i) is, if you will, problem identification, the first key to good projects, and maybe even more challenging. The students DID do some of (i): So, they saw some of the challenge and likely see the need to do some more.

Johnny’s work will remain important for hundreds of years; the iPhone development tools will age like butterflies.

Maybe Johnny paid attention to the women then!!!!! Tough to ignore those women!!! Johnny’s friend Uncle Al there at the IAS did, including a friend then 15!

Ah, those WOMEN, back when the US was doing well in family formation, growing the economy rapidly, etc.!

US family formation SUCKS. Where can we get back to real women and real families?

Ah, a YouTube video clip, and someone FOUND it!!!! That can be a challenge. Is there a problem there?