Sep 6, 2015

Recommending Recommenders

It seems like more and more of my engagement with various services I use involves some sort of recommender. The new version of Google Maps on my phone recommends a certain way to get from one place to another along with a couple other options. SoundCloud and Spotify are generating awesome recommended listening streams for me. Twitter tells me what I missed when I was away and gives me Highlights on my phone. Gmail recommends who else to send the email to. Etsy shows me things I might want to buy.

I am sure all of you are experiencing the same thing. Web and mobile apps are getting smarter and smarter about each of us and recommending things to us that until recently we had to figure out all by ourselves. It almost seems like recommenders are table stakes these days. You can’t even play in the game unless you can do this sort of thing. And that requires a data science team to sift through all the data on your service and make smart recommendations to your users.

This is one of many things that has tilted the web and mobile game in favor of the larger and more mature companies. But there are also tools that you can use to get machine learning as a service to compete with the big guys. Our portfolio company Clarifai has an API for machine learning for images and video, for example. If you are building a service that has a lot of images and video and know you need to build a recommender but don’t have the machine learning expertise in house, you may be able to do a lot with Clarifai.

My partner Albert calls this sort of thing the “unbundling of scale” and it is something entrepreneurs need to do more than ever as the big “new incumbents” are turning scale into advantage using data science.

#entrepreneurship

Comments (Archived):

William Mougayar Sep 6, 2015

Amazon too. They probably had the first recommendation engine out there.
1. awaldstein Sep 6, 2015
  
  Amazon invented the recommendation platform, destroyed the need for experts in most categories and simple bristle block matching changed the consumption habits of the last decade or so. It’s not the ‘try this’ that does it. It’s the ‘try this’ with a soap box of people who give their opinions. I buy everything from a cat treat (1500 personal opinions!) to refrigerators and a $20K industrial Robot Coupe from this platform. People on Amazon are the data. Not just the 4 stars of how good it may be.
  1. PhilipSugar Sep 6, 2015
    
    I agree completely the people on Amazon are the data.
  2. William Mougayar Sep 6, 2015
    
    Great point. People are “part” of the data. I’m sure Amazon sifts that a bit. Contrast with Netflix recommendations which have typically sucked for me. I heard they are biased to show the movies that have lower licensing fees to Netflix.
    1. awaldstein Sep 6, 2015
      
      It’s a function of category as well. We buy everything from Amazon that is a catalog item where people’s opinions and the ease of ordering/delivery/return matter. Note that Amazon has failed twice big time and is in the process of failing a third time (in my opinion) in their wine initiatives. Why? Things that can be bought with confidence work. Things that truly need to be sold, like wine, don’t.
      1. William Mougayar Sep 6, 2015
        
        True. Wine recommendations are more tricky.
      2. awaldstein Sep 6, 2015
        
        Except for some clubs and of course direct from producer lists all wine online has failed to date. Wrote this a while ago but still my thoughts on this. Wine needs to be sold, not bought http://awe.sm/iOB8I
      3. LE Sep 6, 2015
        
        Maybe the truth (from my perch of knowing zip about wine) is that there isn’t enough clear differentiation of the product, so that without “an Arnold Waldstein” adding color commentary (and enthusiasm) the product doesn’t actually stand on its own by its actual merits. Unless of course someone is familiar by experience over time with the art of wine and has developed that appreciation. Or it’s an acquired taste in the sense that someone who is given an olive or an anchovy or cigar and is not familiar with those is not going to automatically love the taste. (Nobody needs to acquire a taste for chocolate or ice cream cake for example). So the role of “the Arnold” is to simply get people to the point (by enthusiasm, authority or respect) whereby they are able to appreciate the art and differentiation of the product on their own.
      4. PhilipSugar Sep 6, 2015
        
        They are tricky because they need to be two way. Buy this wine because you liked that wine. Well I liked that wine with my crab cakes, but today I am eating steak.
      5. awaldstein Sep 6, 2015
        
        Wine is a tough one. I can guarantee that when we go out for a drink at my neighborhood wine bar I’ll be able to get a bottle in front of us that you will love and have never tried before. And it will be fun and informative and a kick. Assisted buying unless you are buying wine as an asset is part of the pleasure of drinking it. Rock star Somms are as valuable as the chefs nowadays and great ones as rare.
      6. William Mougayar Sep 6, 2015
        
        yes…many variables at play. But life is too short for bad wine 🙂
      7. sigmaalgebra Sep 6, 2015
        
        That’s a huge point:
        Approach 1: Learn about the person and give all results based on that.
        Approach 2: For each query, search, purchase, recommendation, etc., treat it and the person as unique in all the world and learn only in that context.
        Approach 1 might work okay for some ad targeting but is, for the reason you mentioned and more, close to a bad joke for more than just ad targeting.
        Why the difference? With ad targeting still have a shot selling a Toyota even though the user is looking for running shoes, wine, or a movie DVD.
        But vague information, e.g., demographics, on the person still won’t work well for recommending wine, etc., if only for just what you said: In some cases you want a classic red with a lot of body and bouquet (La Tâche, Romanée-Conti, Barolo, some Haut-Medoc), a great white for some seafood (Montrachet, Pouilly-Fuissé, something good from near Macon, a Soave), some Champagne, something red but simpler, e.g., a Beaujolais, a classic, sweet dessert wine (Chateau d’Yquem, Beerenauslese), for a dinner based around Italian red sauce, Chianti, a sweet, sparkling dessert wine, Asti.
        Net: Just some demographics on a person is junk for a wine recommendation. What wine they liked the last time is nearly as useless.
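A minimal Python sketch of the contrast between the two approaches, using wines named in the comment above. The catalog, the style labels, the `PAIRINGS` table, and both function names are assumptions invented for illustration; this is not how Amazon or any real recommender is known to work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    name: str
    style: str  # illustrative label, e.g. "full red" or "dry white"


# Hypothetical catalog: the names come from the comment, the style labels are assumed.
CATALOG = [
    Wine("Barolo", "full red"),
    Wine("Montrachet", "dry white"),
    Wine("Pouilly-Fuissé", "dry white"),
    Wine("Beaujolais", "light red"),
    Wine("Asti", "sweet sparkling"),
]

# Hypothetical meal-to-style table standing in for "what I am eating today".
PAIRINGS = {
    "steak": "full red",
    "seafood": "dry white",
    "dessert": "sweet sparkling",
}


def recommend_by_profile(last_liked: Wine) -> list[Wine]:
    """Approach 1: suggest more of the style the person liked last time."""
    return [w for w in CATALOG if w.style == last_liked.style and w != last_liked]


def recommend_by_context(meal: str) -> list[Wine]:
    """Approach 2: ignore history and use only the current request."""
    style = PAIRINGS.get(meal)
    return [w for w in CATALOG if w.style == style]


if __name__ == "__main__":
    liked_with_crab_cakes = Wine("Pouilly-Fuissé", "dry white")
    # History says "dry white", even though tonight's dinner is steak.
    print([w.name for w in recommend_by_profile(liked_with_crab_cakes)])  # ['Montrachet']
    print([w.name for w in recommend_by_context("steak")])                # ['Barolo']
```

The profile-based function keeps returning whites because of the crab-cake dinner, while the context-based one answers the steak question directly, which is the failure mode the thread describes.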
      8. PhilipSugar Sep 6, 2015
        
        I have inside knowledge about Amazon’s work on recommendation. It is really, really tough and they work on it a ton, but as you state trying to get it right is very hard. I know a gaming company that tried it as well (if you bet on this wouldn’t you like to bet on that??). Two issues: when you get it wrong it is super annoying, and I agree it needs to be a two way discussion (which is what selling is).
      9. LE Sep 6, 2015
        
        “I agree it needs to be a two way discussion (which is what selling is)”
        Yes. In selling a two way discussion is merely a way to be able to hit the hot buttons of the person that you are selling to in order to brainwash the message that actually matters to them as opposed to what you think should matter to them.
        If I am buying a machine for a business and my primary concern is “service response time of 2 hours or less 24x7x365” then a good salesman shouldn’t be spending a larger percentage of the sales call driving home other benefits of the product that don’t matter to me. It’s easy if you simply pay attention to someone’s reactions and verbal and physical cues. Not everyone can do this. Note that this doesn’t mean spend the entire sales call on service either. Obviously.
        All of this involves knowing the needs of the particular customer that you are selling to. So you can target the correct message to them.
        If I was selling home improvements to you the first thing I would uncover in discussions is that you really cared about quality of the product (I am sure that would become apparent in any discussions with you or I could just ask you some questions and uncover it). There is no step by step for doing this; you sort of feel your way along w/o throwing up any obvious cues that that is what you are doing.
        Otoh if I was selling to your wife maybe she would care more about the mess and the dust and interruption of home life.
      10. LE Sep 6, 2015
        
        I didn’t even know I could buy wine on Amazon. I just did a search for “wine gevurtzimer” [1] and Amazon doesn’t know what I want. [1] (sic – Google knows what I mean more than Amazon does)…
      11. awaldstein Sep 6, 2015
        
        they won’t know what it is if they don’t sell it most likely. and since all is domestic and very little gervurts is made here that is most likely your answer. they will fail but not because their search is bad.
      12. LE Sep 6, 2015
        
        Actually it seems that Amazon sells Gewurztraminer if you spell it correctly: http://www.amazon.com/s/ref… Until this thread today I had no clue they even sold wine even though I spend a ton at Amazon on all sorts of things. The only reason I found out that I liked that wine (and Riesling) was that my brother in law (who is into wine) brought it to a dinner several years ago. So the question also is where is the “if you like this you will also like that” in all of this. Believe it or not many people don’t even know that “dry” means “not sweet”. (I didn’t for example..)…
      13. LE Sep 6, 2015
        
        “Things that truly need to be sold, like wine, don’t.”
        Wine is partly a “party in your brain” product just like a Porsche or a Tesla is.
        I think you are right; however, I would modify that to say “things that need to be sold that involve taste or smell are much harder to market that way”. Assumes that nobody knows upfront or has knowledge of the particular product that they are buying of course.
        For example the following products definitely need to be sold but do quite well online:
        - Expensive Watches
        - Some high end Jewelry
        - Art
        One reason is that they don’t involve taste or smell like wine does.
        Mouth.com sells food but unlike Amazon marketing wise is angled towards overcoming the limitations of not being able to taste or smell the product. For wine: http://www.mouth.com/collec…
        Mouth also fails the “wine gevurtzimer” test with their search but yet they sell the product. That’s easily fixable; I wonder if they monitor their logs for this type of thing (they should).
        Ditto for wine.com
      14. awaldstein Sep 6, 2015
        
        your logic breaks i believe when you apply it to wine. as how people buy vegetables is not correspondent to how people buy wine. in any way really. they are both food and scarce and perishable by definition. they should be subject to the same food certs and disclosure. beyond that nothing is similar to how consumers choose. as a rule, with some exceptions all non assisted buying online or off invariably have failed except at the very highest end for collectors and the fattest lowest end where it is all price and low end branding. for someone with lots of knowledge the basic equation of grape/region/vintage/producer can hold but that breaks as well.
      15. LE Sep 6, 2015
        
        I am not seeing a close connection (in perishability) between wine and vegetables (if I understand what you are saying … I may not). Vegetables are perishable. Feel free to correct me but wine kind of stays a long time (even the crap that I would buy) not to mention whatever happens to certain wines with aging (I have no clue but I know old wine is valuable). Like for years at least, right?
        Closest I can think of is that for the majority of people wine buying is not anywhere near as important as it is to wine aficionados. Any more than buying cheese is. The majority of the market for cheese is the “crap” that a company like Kraft sells. Not the specialty cheese bar at a specialty foods store. Small counter not very active actually (relative to the volume of the entire market that is.) Leads me to believe that with wine (or cheese) people are merely looking to satisfice with the purchase. No low hanging fruit of opportunity with the product. After all why do most people drink liquor? For the buzz? Or for the happiness of the art? (My guess is “for the buzz” for the “majority”). Why do most people bring a bottle of wine when going to someone’s house for dinner? (I said “most” not “Arnold and his ilk”). Because it’s a good way to “bring a gift” when you are dining with friends. Most people don’t even know who brought what bottle.
        That said, if I had a clue I would get more out of it from a hobby and enjoyment perspective. But it’s not that important to me to put in any effort. I suspect most people are like me as well.
  3. Richard Sep 6, 2015
    
    Aren’t the people and the star ratings the same data on Amazon?
    1. SubstrateUndertow Sep 6, 2015
      
      Those stars are attached to some very rich descriptions of the visceral experiences that accompany each product. Visceral experiences that can be easily aligned with your own subjective product-quality priorities.
2. Twain Twain Sep 6, 2015
  
  At the moment, Amazon has the best recommendation engine because it’s directly tied to purchase and specific text mining on the customer feedback. All the other recommender engines are based on intent. They can, as Facebook and Twitter do, try to migrate users over to purchase via BUY buttons. However, there are challenges in that implementation. There’s another way of doing it which would involve changing user interaction form factors. This may not be readily accepted by existing users, though, because then it’s a whole new value proposition, purpose and experience.
3. Bruce Warila Sep 6, 2015
  
  banner ads + adtech prior to Amazon?
  1. William Mougayar Sep 7, 2015
    
    Ad tech as intelligent targeting is a pretty low bar 😉
PhilipSugar Sep 6, 2015

I am not dismissing or downplaying the role of data scientists. But there is a ton of just simple “blocking and tackling” work that can be done here as well. You can survey people and then actually use the results for example. Companies have gotten better at this. For instance just ask me: high or low floor?? near or far from elevator?? fruit plate or beer?? when I check in. Then deliver! That is the differentiator. But sometimes when I get recommendations that are wrong it’s worse than none at all. Yes I know you think I like windowbox herb gardening because my wife bought some books on my account, but I find those things get in my way and are annoying.
1. William Mougayar Sep 6, 2015
  
  True on the possible misinterpretations, but I think what you described at the end is part of retargeted advertising and they use just 1 data point: your search history. That’s a hit and miss kind of recommendation. I think the more data points, the better the accuracy, typically.
  1. PhilipSugar Sep 6, 2015
    
    Yes, but as awaldstein correctly points out it is the recommendations which make Amazon. (that and the execution of their delivery)
2. LE Sep 6, 2015
  
  “For instance just ask me: high or low floor?? near or far from elevator?? fruit plate or beer?? when I check in.”
  Don’t hotels have other constraints and reasons they place people where they do? [1] And also will giving better service as you are suggesting (w/o someone asking) move the needle and overcome the negative of the effort? Will they actually do more business and build good will? Will it change your rewards club tendencies?
  Restaurants almost never ask you what kind of table you want (other than “outdoors ok?”). I always want a booth and ask for it. I don’t want to be right on top of anyone else jammed in. But does anybody? If that were part of the reservation or they asked when I checked with the receptionist then they either have to accommodate me or tell me “sorry can’t do that” which is a negative. So maybe better not to ask if the ratio of requests to delivery is going to be small.
  [1] For example hotels have maid staff, room maintenance and perhaps they don’t want kids running down the hall to a room at the end. Or they want equal wear and tear on room inventory. If you ask you either need to accommodate or you are creating a negative. If you go to the front desk and ask “close to elevator” and the hotel says “sorry can’t do that” then there is a negative because they haven’t given you what you have asked for.
  1. PhilipSugar Sep 6, 2015
    
    No I get the same exact seat on 90% of my flights and a room with what I want 90% of the time.
3. OldManGoldenwords Sep 7, 2015
  
  I don’t know of anyone who happily answers a survey. Either they don’t do it, or even if they do they aren’t honest, because they want the survey to end fast. The best recommendations I have found were on YouTube. Sometimes a recommendation leads me to discover new things. They work so well because people can lie but their actions won’t.
  1. PhilipSugar Sep 7, 2015
    
    Guess survey is a bad term. Yes, filling out some long survey for the benefit of the company is worthless. Which is why people use the net promoter score. But if you ask me a question and act on it, that is powerful. I like rooms away from the elevator (noise); I have a coworker who likes them near the elevator (convenience).
Richard Sep 6, 2015

Aren’t companies like Zenefits a third scenario in the bundling/unbundling debate?
kirklove Sep 6, 2015

It’s amazing how much good old “human” power is behind all this:
1) Google maps and street view, human driven
2) Spotify human editors
3) Amazon human reviewers
4) Etsy human curators
5) Kickstarter human featured projects
6) Waze – human, real-time updates
etc…
Algorithms and APIs are great, and getting better and better, though the human curator is still far more accurate and nimble.
1. William Mougayar Sep 6, 2015
  
  I think it’s both & becoming a mashup of human + machine together.
  1. Twain Twain Sep 6, 2015
    
    I’m a big fan of the Human<=>Machine model wherein it’s people who CALIBRATE the algorithm.
  2. sigmaalgebra Sep 6, 2015
    
    Yes, in a sense, that’s what my approach is.
    No way do I want to use data science to evaluate a piece of music, a painting, a video clip, or a blog, i.e., an instance of Internet content.
    Uh, more generally, data science is a lot of heuristics that appear to work on some example cases of data or just statistics done badly, e.g., with no hypotheses, use of theorems or proofs, or results known in advance of the computing. That’s why the more serious books on statistics are packed with theorems and proofs.
    One approach in recommendations, discovery, search, etc. is to involve keywords/phrases again, e.g., match, e.g., for similarity, instances of content based on some given keywords/phrases or somehow get some keywords/phrases out of the content and then match based on keywords/phrases.
    E.g., here’s some of the weakness of keywords/phrases: Can do a Google search on “home decorating” blogs and get “About 1,180,000 results” (apparently the number of results can vary a lot by day).
    So, likely nearly all the results are blogs, but, still, we are stuck-o: So, look at the first few dozen, i.e., in some sense the most popular, or think of some more keywords/phrases and try again.
    Net, from this example, and easily could cook up thousands more, even for content based on text, search by keywords/phrases is not very good.
    Why not? It’s simple: Humans don’t give a darn about keywords/phrases; else a dictionary would be great reading. Instead, humans care about, may I have the envelope, please, drum roll, please, the meaning of the content.
    So, how the heck to write computer software that can work with, handle, meaning?
    My approach has no data science and, instead, is just some applied math. The applied math has theorems and proofs, and those are an advantage.
    Now that I have my software running, sometimes I ask people for some feedback:
    Q. Suppose you do a Google/Bing search and get 10,000,000 results. What do you do then?
    A. Look at the first few!
    Q. Suppose in about five minutes I can get you the 1-2 dozen of the 10,000,000 you will like best?
    A. How do you do that, by some psychological test?
    Q. No, that would not be accurate enough for doing well with search, discovery, recommendation, etc. It could work for some ad targeting.
    Instead, the five minutes is spent on learning what you will like and not in general but just for that one search treated as unique in all the world. So, our results of search, discovery, recommendation are fully personalized and not really to you but to your particular interest in that search, etc.
    The key to our value is that in the five minutes we learn enough to do well giving you the content that has the meaning you want.
    We don’t like to say learn because that sounds like some computer science thing; with us, it’s not and, instead, is, internally, just some quite precise mathematics.
    We do well protecting your privacy: We don’t ask you to log in or enable cookies. If you do another search, we have no idea what you entered on previous searches.
    AVC community feedback to these questions is also welcome!
    The software is 18,000 programming language statements in 80,000 lines of text.
    What is on the 80,000 – 18,000 = 62,000 lines of text? Well, on average each programming language statement is two lines long, and nearly all of those statements are preceded by a blank line. So, that’s 3 * 18,000 = 54,000 lines. So what is left is 80,000 – 54,000 = 26,000 lines, and that is comments in there to help any reader understand the code.
    Somehow when type in that much code, can get a lot of dust in the house and dirt on some of the floors. So, yesterday I got torqued, got out the 2.5 horsepower, 20 gallon, wet-dry vacuum cleaner (dainty little thing), and made a lot of progress!
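The line-count arithmetic in the comment above can be checked with a few lines of Python. The variable names are illustrative assumptions; the figures of two lines per statement and one blank line before each statement are the commenter's stated averages, not measurements.

```python
# Figures quoted in the comment.
total_lines = 80_000
statements = 18_000

# Commenter's stated averages (assumed exact here for the sake of the check).
lines_per_statement = 2
blank_lines_per_statement = 1

# Lines that are not the first line of a statement.
remaining_after_statements = total_lines - statements

# Lines used by statements plus their preceding blank lines.
code_and_blank_lines = (lines_per_statement + blank_lines_per_statement) * statements

# Whatever is left is attributed to comments.
comment_lines = total_lines - code_and_blank_lines
comment_share = comment_lines / total_lines

if __name__ == "__main__":
    print(remaining_after_statements)  # 62000
    print(code_and_blank_lines)        # 54000
    print(comment_lines)               # 26000
    print(f"{comment_share:.1%}")      # 32.5%
```

Under those averages, roughly a third of the file would be comments.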
    1. Twain Twain Sep 6, 2015
      
      So you’ve built a contextual meaning-based search engine? Is this what you’re saying?
      1. sigmaalgebra Sep 6, 2015
        
        Can’t say because I’m lacking a good, clear definition, description, explanation, etc. of a “contextual search engine”. And I don’t know that so far it’s reasonable to say that there is a real class of search engines there.
      2. Twain Twain Sep 6, 2015
        
        Contextual search engine being pretty much the process you described in your answer to Philip Sugar’s comment below.
    2. Twain Twain Sep 6, 2015
      
      Google would argue that they can already do contextual meaning-based search via their semantic tags, plus cookies in browsers for our previous activity, which provide another dimension of context, plus a combination of clicks and ratings.
      1. sigmaalgebra Sep 6, 2015
        
        Thanks for your feedback!
        In my usage of Google/Bing, etc. (below, just Google), I can’t get anything at all similarly useful to what I’ve programmed.
        For what Google has tried to do, say, originally a computer version of a library card catalog subject index sorted by a measure of gross popularity, Google is just terrific.
        I.e., when (1) know what want, (2) know that it exists, (3) and have some keywords/phrases that accurately characterize what want, then Google can be just terrific, a step up in civilization and the ascent of man. Terrific.
        E.g., from Google query “round up the usual suspects” likely could get the transcript of the movie Casablanca. But, again, for that search, have all of (1), (2), (3).
        Next, the Internet content that Google is really good at is based on text, or at least a lot of text in metadata. Then for other content data types, e.g., recorded music, still images, video clips (with high irony, at YouTube), old movies and TV shows, and more, the effectiveness quickly declines from poor to useless.
        Indeed, as in my example of home decorating, even for content based on text, Google commonly does poorly. And why? Don’t have all of (1), (2), and (3).
        For “cookies in browsers for our previous activity”: They are no doubt useful for ad targeting but IMHO are next to useless for search, discovery, recommendation, etc. E.g., see my example of looking for wine in http://avc.com/2015/09/reco… elsewhere here today.
        E.g., maybe you have been shopping for athletic clothes, including running shoes. So, you do another search for running shoes and get, again, the results you used last time. Alas, this time you are shopping for your niece who is 9 years old. Bummer. You have bought some movie DVDs, but now are looking for some to entertain some kids 5-7 while the adults have a nice long, classic French dinner.
        The difficulties go on and on this way.
        Moreover, IMHO getting at, i.e., handling, meaning is much more difficult than can be achieved with cookies, etc. By meaning I’m talking, just for one example, artistic content based on the user’s personal artistic taste. Just have no hope of getting that out of cookies, etc.
        Going from syntax and semantics to meaning is a Holy Grail problem in computer science. That we can make good progress here is surprising.
      2. Twain Twain Sep 6, 2015
        
        Well…you do know that:
        (1.) The syntax provided by the likes of Chomsky and Minsky for machine extraction is incomplete (as much as I do respect their work). In fact, there are inherited syntax issues from Dr. Johnson’s dictionary and a whole raft of lexical databases.
        (2.) The semantic tags provided by W3C and Schema are also incomplete so this compounds the inability to parse for…
        (3.) Meaning which is contingent upon context that is more than socio-demographic and cookie-tracking.
        Yes, HOLY GRAIL problem and not at all trivial but deep and wide problems to do with data. Endemic, even, across the entire Web.
      3. sigmaalgebra Sep 6, 2015
        
        My approach has nothing to do with Chomsky!
        Instead, my approach is via some theorems and proofs in math with some advanced, and quite general, prerequisites. From that math, what I’m doing basically has to work.
        Sorry, Chomsky!
        Parts of the prerequisite math are astounding: At first glance it looks like, what the heck does this mean? It looks like generalized, abstract nonsense. Then close the book and try to prove it and begin to see it’s not trivial. And, then, without a solid proof, just will not believe it. Then read the proof and say, “Yup, it’s true”. Wow, darned clever proof. Wow, what a result! I’d never believe that any such thing could ever be true, but it is! Or, just look at what the result says for some examples: astounding. Without proof, wouldn’t believe it.
        And the set-up, that is, the previous 50 pages of the book, are much worse, astounding result after result tough to believe true without proof, and the proofs are darned clever.
        Then need to see how that result and the related material can be applied. That the material is extremely general is a big help; the results really are general enough for the work on meaning.
        We’re talking ballpark 50 years of some of the best math research ever, crown jewels of civilization. No one, ever, has any chance at all of reinventing all that stuff on their own; no one is that smart. It’d be like asking one person to beat Bolt in the 100 yard dash, Jordan in one-on-one, Brady in throwing passes, James in slam dunks over defenders, throwing a no-hitter against the Yankees, etc.
        Get the stuff about the way I got it or just don’t get it. Then can’t do the math I did and can’t write the software. Without the proofs, no one would believe it.
        Sometimes math can be an advantage.
        But the power of the math for meaning will be known only to me. The people interested in the search business don’t want to hear about math. The people who know the prerequisite math don’t care about business.
        No investor wants to hear about theorems and proofs; they never saw any money made that way.
        A long time ago I gave up on equity funding: totally hopeless. For the time and effort I tried on that path, I could have written a lot of the code sooner. The situation on equity funding is easy enough to understand just from the “The Little Red Hen” in Mother Goose. Both the hen and I have to make our evaluations early on, but no outsider will do that.
        My project is designed to need meager capex and opex until there is nicely positive free cash flow and then, for me, as for the guy who did Plenty of Fish, no willingness to accept an equity check, report to a BoD (I’d be terrified to do that), etc.
        So, it’s just me. I’m alone out there, but, quite broadly and for a lot of good reasons, that’s to be expected.
        I was alone out there when I did my Ph.D. dissertation: I thought of the problem, did the research independently in my first summer in grad school, and got direction later. And I got the university approval the risky way: do all the work, to the finished document, first and get the reviews and stand for the oral exam later. None of my advisors knew any of the details until I submitted the final document. Been alone before. One guy, Member, US National Academy of Engineering, did want one change: I added a few words to one paragraph to be more clear. Then I had the word processing type the whole thing again, and that was what was approved.
        Boy, did I like the results on regular conditional probabilities! Thank you Leo Breiman!
        But on my current project, the theorems and proofs are crucial for me; otherwise I would not have evidence enough to make the bet I am making.
        It’s a fact about math that usually a good, new result is understood by few enough other people to fit easily in an airplane wash room. So, mathematicians tend to be out there alone or nearly so. So, one nice thing about carrying some such math into business is that such success is one way to have other people appreciate, at least in a sense they can respect, some of the power of math. They won’t appreciate my work or the prerequisites until I get a 200 foot yacht and hold a nice party on Long Island Sound.
        Even then, they will appreciate only meagerly: E.g., I see little evidence that James Simons has many people trying to do much the same as he did. Instead, everyone else on Wall Street is trying to make money other ways, and sometimes they do.
      4. Twain Twain Sep 6, 2015
        
        A-ha, got it.
        You wrote, “No investor wants to hear about theorems and proofs; they never saw any money made that way.”
        Haha, well… what’s Google, Facebook, Microsoft, IBM Watson?
        Theorems and proofs and mathematical experiments about people made into products.
        Investors have made and will continue to make plenty of money from people who know theorems and proofs.
        Particularly in the Machine Intelligence and Data Science age.
      5. sigmaalgebra Sep 7, 2015
        
        The investors don’t think of those examples as being seriously dependent on theorems and proofs. Instead, they are much more comfortable with all the technology being just routine software in, say, C, C++, PHP, Ruby, Python, C#, Java, JavaScript, etc.
        Read the bios of the venture partners and see how many BS and MS math majors you find. How many Ph.D. math majors? Maybe in all the US there are 1-3. IIRC there is one Ph.D. math program dropout; I didn’t drop out. There are not even very many computer science majors. Instead we’re talking a lot of English, history, political science majors and MBAs. Also a lot of lawyers.
        That is, we’re talking non-technical people. Soooooo, guess what: They base their decisions on non-technical aspects. E.g., nearly no venture partners have the qualifications to be a problem sponsor at NSF, DARPA, NIH, NASA, Army Durham, etc.
        When I wrote my Ph.D. dissertation, I was more technical than my official advisors; I did not like that situation. Indeed, the university rules said that my orals committee had to have a Chair and majority from outside my department. My main advisor asked me to suggest a Chair, and I suggested a guy who was technical enough, e.g., a student of A. Tucker at Princeton, as in the Kuhn-Tucker conditions (where I also published a paper, written while I was a grad student and before my Ph.D.). So, the guy I picked could read my work. Good. Whew! Dodged that bullet.
        A BoD would not understand my work. When I proposed a new project, e.g., that would have to have a budget and, likely, need to be approved by the BoD, I would give an outline of the math, and after a few minutes of agony from understanding less than 2% (it’s just a matter of having the prerequisites), the BoD would rush to the rest rooms, and we’d have to replace the chairs and all the carpet on the path they took. I’d need a BoD Chair like the Chair of my Ph.D. orals committee, and there’s no hope for one.
        There’s just no hope: I want to use math for a technological advantage, barrier to entry, etc., and for that there’s no usual BoD in the US I could work effectively with. Hopeless.
        The math is an advantage, but there are two sides to that: (1) It’s an advantage and a barrier to entry partly because so few people understand that math. And (2) it’s a disadvantage, for the same reason: so few people, e.g., BoD members, understand it.
        Or, since nearly no BoD or equity investor could understand the math, the flip side is an opportunity, that is, very little threat of close competition.
        There’s the common: “Whatever you have thought of, someone else has thought of it before.” Standard insult to entrepreneurs! And it can seem true to people with no experience with really new ideas as in good research. But, sure, take the set of all people who have thought of it, sort on date, and observe that for the person with the earliest date the claim is false. People in math think this way; people in business don’t.
        Another one is, ideas are easy, plentiful, and worthless. Good execution is difficult, rare, and everything. Hmm. Well, bad ideas are easy, plentiful, and worthless, and, when following a bad idea, good execution is difficult, rare and everything. A good idea is difficult, rare, and valuable, and following a good idea good execution can be routine and low risk. People in business with no contact with good ideas from research have a tough time thinking this way.
        If are really out there alone, as the first, way ahead, then, let’s see: One consequence is, right, are out there alone. Not a deep observation!
      6. Twain Twain Sep 7, 2015
        
        You know the “Picasso Principle”, Da Vinci’s quote “Simplicity is the ultimate sophistication” and Einstein’s “If you can’t explain it simply, you don’t know it well enough”, right?Investors and business strategy folks speak a language that’s different from how mathematicians speak and different from how users speak whilst art & design is a universal language.Instead of requiring others to speak our language, it’s equally important we learn to speak theirs.Now, imho, you really don’t want a BoD with people like you (PhD in Maths). You want a BoD with folks like some of the ones on AVC (@wmoug:disqus , @pointsandfigures, @JLM, @domainregistry:disqus @samedaydr:disqus @DonnaBrewingtonWhite and at least a dozen others).If you’re determined to get a BoD with PhDs in Maths, then go find the Quant Risk guys in the investment banks.However, having only Maths PhDs who can understand your work on your BoD increases the risks of an echo chamber where all of you are convinced the system is super-clever and mathematically right but it doesn’t translate to the market, users and investors who don’t have PhDs.Picasso’s ‘The Bull’ is a terrific masterclass in abstraction and making things that are simple and easy to understand whilst referencing things that are much more complex.So…do you need to explain complex maths to investors to make them understand and appreciate your work?
      7. sigmaalgebra Sep 7, 2015
        
        I’m big on simplicity and just pleasing the users. If I just wanted to do some math, then I have plenty of it I’d like to do and would and wouldn’t type in 80,000 lines of text for some software — given the math, all totally routine for me, not much more interesting than mowing the grass, and grass mowing is much better exercise.Your points are correct, but maybe you are assuming about me some stereotypes that are not true at all.E.g., yesterday athttp://avc.com/2015/09/reco…I tried to give a really simple explanation of some of what my work might mean to a person: Q. Suppose you do a Google/Bing search and get 10,000,000 results. What do you do then?A. Look at the first few!Q. Suppose in about five minutes I can get you the 1-2 dozen of the 10,000,000 you will like best? And I asked for feedback on this. There’s more that users can like, but the above point is just dirt simple. With that description, there’s no math at all, and the description is 100% focused on pleasing the user — it’s all about the user.For more on being simple, yesterday inhttp://avc.com/2015/09/reco…I wrote: The UI/UX is supposed to be really easy to use, say, even just an English language version by a five year old anywhere in the world. For your, Now, imho, you really don’t want a BoD with people like you (PhD in Maths). I’m terrified of reporting to a BoD that can’t understand the crucial, core applied math work that is the technological advantage and barrier to entry.Why? There are several reasons, but the most direct reason is that the company will need to do more, do more projects, internal applied research projects, applied math projects, e.g., for ad targeting, maybe for server farm security, reliability, and performance.So, likely the BoD will want budgets, annual and quarterly. Well, each significant new project will need funding from the budgets. 
Since the budgets will likely have to be approved by the BoD, they will, to be responsible and diligent, want to review the projects with budget requests.At the start, say, for more in ad targeting, they might ask: Q. (1) How long will the project take? (2) How much will it cost? (3) How will it work? (4) How much better will it work than just what is routine now? (5) What will be the financial gains for the ad revenue? For projects, traditional BoD members are used to asking such questions and getting solid answers. E.g., can take some plans for a building, small, medium, or large, from an architect to a general contractor and get such answers. Such a BoD member won’t want to spend even one second or one cent on a project without solid answers to such questions up front.At first, I may have no answers at all. It’s an applied research project; we are looking for the results, and we don’t have them yet.With such an answer, I stand to have a totally torqued BoD. Some members of the BoD will start making standard highly contemptuous comments about blue sky, fun and games, far out, smoking funny stuff, etc. The fact that I have a good track record in applied research and did the research for the company so far will be of no interest to the BoD because they know nothing of research, don’t want to, don’t respect it, and want nothing to do with it. 
That’s some of why I’m terrified to report to a BoD.So, I tell the BoD that we will set aside the project for now.Then I arrange my schedule to give me some quiet time, not all worn out, put my feet up, pop open a cold can of diet soda, and start thinking, e.g., about the data I’ve got, what more data I might reasonably be able to get, what we do know to date about how the ad business works, e.g., the ad networks, ComScore, Google Analytics, charge per thousand, charge per click, unique visitors per month, how the Web site software works and its software for displaying ads (will have to match up with that), etc.Then I’ll think about how to take the data, manipulate it, and get the data needed for ad targeting.So, the core of the work is the data manipulations. Those manipulations are necessarily mathematically something, understood or not, powerful or not. So, I will proceed mathematically. There first I’ll think broadly and mostly intuitively about what mathematical approaches might work; I’ll think about computational complexity, the challenge of the software, testing, monitoring, etc.Then, from history, after two weeks or so, I’ll have some intuitive ideas, make them relatively precise, formulate the new aspects as theorems, and, if I can, write out good proofs. I’ll keep working until I have good proofs.Then back to the BoD, the rest of project, and its budget: The BoD should want to know just why the project is now promising, in effect, a good investment of the free cash flow (treat the budget as an expense, that is from pre-tax instead of after-tax) of the company. Then the answer, the only answer there is so far, is the applied math I derived. The answer is some math. If I give a solid presentation, then there will be theorems and proofs. If I don’t give a solid presentation, the BoD will suspect that I’m just trying to fool them.Then the BoD won’t like that situation: They’ve never seen or even heard of any such thing ever before. 
Since there is little more frustrating than listening to a math lecture without the prerequisites, the BoD will feel like they are standing barefoot in ice water, undergoing a barbed wire enema and an unanesthetized upper molar root canal procedure, while sweating from a 10,000 Watt heat lamp. They will feel insulted, inadequate, humiliated, embarrassed, at risk of failing in their responsibilities, and conclude that I’m just trying to intimidate them.If they stay 10 seconds longer than the longest they can stand, which won’t be very long, they will rush to the restrooms soiling and ruining any carpet on their path.They won’t stand for it. They will resign from the BoD, get rid of me, one or the other or both or worse. It will be a disaster for the company, the BoD, the ad targeting, and me.They will gather at a local bar and commiserate and agree that they are in business, they’ve been successful in business, they know a lot about business, in all their time in business they’ve never seen anything like my math lecture for ad targeting (or whatever the subject was), they know that there are other ways to make money, and if they are still interested in my company then they will move, with great anger and determination, seeking retribution, to get the company on a business-like basis and have no more college professor type lectures, especially not on math or some such.Someone will say, “A well run business doesn’t waste time, money, or effort on research except in the case of running a patent shop or just getting luster to please the stockholders.”They will think that their job on the BoD is to make sure the company is managed well. So, as inhttp://avc.com/2015/06/loya…there is The manager does things right; the leader does the right thing. 
and they will hope to do the first and will neglect with contempt the second.They will want to take the existing business and just run it, e.g., as a cash cow.Over time, the business will need some enhancements, but the BoD will neglect those. The business will decline, maybe even rapidly, but the BoD regards that as just normal and to be expected and accepted. When the business is nearly dead, the BoD won’t think that the problem was that innovation stopped but just that an oil well went dry. Then the BoD will move on to other businesses.A larger lesson that covers this issue, really, innovation in business, is:Near the end of the movie Moneyball, from John W. Henry, owner of the Boston Red Sox, to Billy Beane, General Manager of the Oakland Athletics, at the end of Beane’s quite successful first season using the statistical lessons of Moneyball, there is the statement: I know you’re taking it in the teeth out there, but the first guy through the wall, he always gets bloody. Always.This is threatening not just a way of doing business, but in their minds, it’s threatening the game. Really, what it’s threatening is their livelihood, their jobs. It’s threatening the way that they do things.Every time that happens, whether it’s a government, a way of doing business, whatever it is, the people who are holding the reins, they have their hands on the switch, they go batshit crazy.I mean, anybody who’s not tearing their team down right now and rebuilding it using your model, they’re dinosaurs. They’ll be sitting on their ass on the sofa in October watching the Boston Red Sox win the World Series. So, my BoD will “go batshit crazy”.I will be out the door. The business will stagnate. 
The BoD will be nice and happy until they want to move on to something else and let my business finally die.No thanks.Did I mention that I’m terrified to report to a BoD?Again, this whole project and nearly everything about it is still close to “The Little Red Hen” in Mother Goose where no one would give as much as two cents or two seconds before the project has/had wild success in the hands of the thrilled users/customers. No one wants to see the work or the kitchen but just the results.But until the results, I will be 100% owner, whether I want to be or not.Then with the results, why be less than 100% owner? In particular, why go from 100% owner to 0% owner, with some chance of getting back to, maybe, 40-60% owner, on a four year vesting schedule during which time the BoD can fire me and leave me with essentially nothing?The point is not the stereotype that I want to play silly math games, intellectual self-abuse, on the dime of the company but, instead, that I don’t want a traditional BoD to mess up the core innovation and, thus, the main power and value, of the company because they’ve never seen such innovation before, can’t understand it, won’t approve budget items for it, in their experience know that there are other ways to make money, and more generally will have a wildly bitter emotional reaction to kick me out of my company. No thanks.Or the BoD and I, as soon as we try to work together productively, say, to get a budget approved, and in that process get to know each other, we will mix like oil and water; the BoD will be one of these two and I the other one; the BoD will coalesce together, agree that I’m worse than a barbed wire enema, just not a good business-like thinker, gang up on me, and boot me out. No thanks.Traditional BoD members will like me and my business when I can get a 200 foot yacht and invite them to a nice party on the yacht in Long Island Sound and not much before.
      8. Richard Sep 7, 2015
        
        What’s your approach?
      9. sigmaalgebra Sep 7, 2015
        
        The math is proprietary, i.e., a trade secret. Of course, much of the math is classic although mostly advanced; some of the math is original (new theorems and proofs). The users will never be aware of the math. There’s more: The overall idea of the UI/UX is novel; users will understand that as soon as they use the search engine (each page has a link to some help particular to that page). The UI/UX is supposed to be really easy to use, say, even just an English language version by a five year old anywhere in the world. For now, I don’t want to explain the UI/UX before alpha test. The mobile strategy is to have just a Web site with all the Web pages just 800 pixels wide, dirt simple layout, large fonts, high contrast colors, essentially no JavaScript, no cookies, no user IDs or logins, and, with horizontal scroll bars, usable with only 300 pixels of width. I intend to offer and request alpha testing and feedback from the AVC community. The overall objective I’ve touched on here; I’m sure the AVC community will catch on right away, heavily just from what I’ve said. Or, besides how to do it, what about the “it”, that is, do you see it as useful? Assuming my technology works as I intend, the real issue is not the math or the UI/UX but just what the site does for people. Thanks for your interest.
      10. Richard Sep 9, 2015
        
        Sounds a little like wildcard
      11. sigmaalgebra Sep 9, 2015
        
        I don’t know what “wildcard” is. I see nothing in the work for which wildcard could be a useful description. But what do you think of the utility of taking 5 minutes to find the dozen or so results you want out of what, otherwise, would be some millions? Now, the results you “want” are for just the search you are doing in those 5 minutes and may be totally different from what you would have wanted for another interest or even the same interest before or later. E.g., maybe you are a CFO and are looking for blogs on Sarbox and get a curation you like. Then right away you look again for Sarbox and get curations you believe will be more appropriate for the accounting group, and the two curations might be quite different from your doing different things in the 5 minutes. What do you think? If you don’t like it, say so, but then tell me why.
      12. Twain Twain Sep 9, 2015
        
        Remember a month back I said Apple is hiring in Machine Learning? It’s now being publicly reported:* http://www.macrumors.com/20…
    3. William Mougayar Sep 7, 2015
      
      Have u read this book by Markoff http://www.amazon.com/Machi…
      1. sigmaalgebra Sep 7, 2015
        
        Hadn’t heard of the book. Thanks for the reference. I just looked at the page at Amazon: Mostly I wouldn’t agree with his thesis: I believe that he exaggerates both what his robots can hope to do within the foreseeable future and also the dangers. The idea that computers are electronic brains goes way back in computer marketing, popular science, etc. E.g., a computer programmed to play a good game of chess still can’t play or even learn to play checkers. A self-driving car is basically driving on, say, electronic tracks and/or with a lot of simplifying assumptions about how traffic works. Well, traffic doesn’t always work that way. So, hint, change the roads to make them friendly for self-driving cars and, then, insist that people drive like the self-driving cars do. No thanks. Still, computers do just what they are told to do. For some computer systems, identify the circumstances in which they are to be used, thoroughly test them within those circumstances, and be sure the systems are not used otherwise. So far our computers and networks are vulnerable to problems in security, reliability, and performance, and the progress here has been so painfully slow that the threat of the robots taking over seems to be from Bugs Bunny or someone trying to hype stock or sell a book.
      2. Twain Twain Sep 7, 2015
        
        Blurb says: “Developers must now draw a bright line between what is human and what is machine, or risk upsetting the delicate balance between them.”Here’s the paradoxical and moral situation wrt that bright line…(1.) Keep the machines as they currently are, with limited intelligence solving narrow problems…=> We risk repetitions of global financial crisis.=> Machines incapable of understanding Natural Language properly.=> Machines have poor constructs of emotions and no related internal model for WHY they should care about us.As IBM Watson’s creator said about his own creation, it’s “akin to a human autistic savant”.On the one hand, some AI researchers argue that the absence of emotion is a good thing because it enables fact-based decision making rather than human biases and errors which are often caused by emotions.On the other hand, the absence of emotional comprehension in the machines means they wouldn’t feel or have any constructs of feelings, e.g. REMORSE, GUILT etc if they kill us.(2.) Create a General Intelligence model from scratch…=> We risk the machines becoming super-intelligent.=> Machines more capable of understanding Natural Language.=> Machines have better constructs of emotions and end up manipulating us, e.g. via even more targeted content.It’s not only the technical challenges of Machine Intelligence which isn’t trivial to think through, it’s also the moral and ethics issues.* http://www.cio.com/article/…
      3. James Ferguson @kWIQly Sep 7, 2015
        
        For many of the reasons above I agree with you and @wmoug:disqus. Bottom line: machine learning in a human functional “wrapper” can be unbeatable. The human is often best placed to apply identified objective information; the machine is often better at the precursors (filtering, data prep, map-reduce, handling tedium, etc.), but a human-supervised presentation layer is often better at assessing the value and applicability of purely mechanistic outcomes. (That is, until the mechanism becomes more sophisticated.)
      4. Twain Twain Sep 7, 2015
        
        Thanks, James. When I was fresh out of university with my maths degree, the then President of the European Neural Networks Society, Professor John G. Taylor, recruited me into a hedge fund (https://en.wikipedia.org/wi…). It was my first exposure to Machine Learning applied to the real world, more than some theory in a book. There’s no question the tools of Data Science (Hadoop, MapReduce, Spark etc.) and APIs like Clarifai are all really useful. However, I look back to new-graduate me and how we did human judgement and supervision and compare it with today’s supervised and unsupervised Machine Learning and it’s clear… There’s still the same Big Black Hole of tools that are yet to be invented and data sets that are yet to be collected, to inform our human decision-making.
      5. Twain Twain Sep 7, 2015
        
        Also, here’s what Qualcomm presented at the ‘Imagination’ festival on bleeding-edge technology in April 2015. To mitigate against the machines becoming super-intelligent and doing harm to humans, Qualcomm is researching how to codify levels and limits of socialization into the robots.I’d argue the machines are only mathematically intelligent rather than also culturally intelligent (emotions, language, socialization, values).Interestingly, in genetics and neuroscience there’s research which identifies the female X-chromosome as being the key gene for cognition, intelligence, language and socialization.Guess what the hardest problem the predominantly male AI researchers haven’t been able to solve is?Natural Language understanding (please see Geoff Hinton from Google saying we need a whole new language model).To-date, the majority of Machine Intelligence and its frameworks have been based on and benefitted from male Y frameworks and Y codes for mathematical intelligence.For the cultural intelligence frameworks, these necessarily need more female X code and it’s a different form factor from Y code.To make the breakthroughs in language understanding that Machine Intelligence needs, the Charles Babbages of AI need to source modern-day Ada Lovelaces.This is WHY women in technology matter. We’ve been key to innovation breakthroughs from the Difference Engine to Wi-Fi to NASA being able to land the Apollo 11 on the moon.Now, we could leave the machines as they are: mathematically intelligent but autistic with no ability to read emotions or support moral+ethical decision-making.Or we can codify them with the X code of female intelligence and make them culturally intelligent too (emotions, language, socialization, values).I believe in X+Y so I designed for it and codified it into my systems invention long before I saw Qualcomm’s presentation in April 2015.
2. Joah Spearman Sep 6, 2015
  
  And humans understand trust and authenticity far more than data. Will be interesting to see how long that’s true.
3. LE Sep 6, 2015
  
  What is truly interesting is the amount of free labor that makes all of this possible. There were always people who helped out at no pay for things (primarily non profit causes) in the past, but I don’t remember many examples of free labor helping for profit businesses prior to the Internet. Some Amazon reviewers get free products but that doesn’t drive the majority of reviews and rankings.
  1. awaldstein Sep 7, 2015
    
    It’s always been like this actually, LE. Atari and Creaf were built on a massive community of enthusiasts. In the millions. They were the marketing team, the community outreach, the touchpoints at thousands of retail locations worldwide. I know this for a fact because I built these networks personally. Building brand without community is a hard slog. Both too hard and not as much fun.
4. CJ Sep 7, 2015
  
  Meet the new boss, same as the old one. Curation is king…again.
5. Douglas Crets Sep 8, 2015
  
  yeah, I am seeing this at the school where I work, where we are building machine learning + human curating to teach a range of skills in STEM and in the arts. More on this later. It’s very cool, and what makes it even better, it is happening in mobile Asia, basically Hong Kong and China.
Dave W Baldwin Sep 6, 2015

Good investment Fred. On the comparison of human vs. algorithm, machine learning is evolving. The missing ingredient in the bigger picture regarding next step upward will probably be introduced in the next 18-24 months which will go hand in hand with those that have best inventory listings related to online store rooms.
LIAD Sep 6, 2015

Just think about the expansion in the gamut of data available on which to make recommendations. Old School Web: History/Bayesian Inference. New School Mobile: Heart beat/Air pressure/Ambient Sound/Accelerometer – just to name a few. A metric shit ton more signal and noise. Mo Sensors. Mo Problems.
lisa hickey Sep 6, 2015

I love the line “unbundling of scale”! That’s a great and helpful framework for strategic planning for large and growing companies. Your post made me realize I have been unconsciously playing a game with all these recommenders: I usually ask myself, “are they right? why or why not?” For the NYTimes, for example, I noticed that a list of “customized just for you” articles came in my inbox yesterday, and I had already read them all. So it was 100% correct, 0% useful. The gmail “who do I send to”, on the other hand, is always useful to me but often wrong, because I send a lot of information out to groups on a need-to-know basis, but try to be conscious about giving people information they don’t need or distracting them from tasks. So for gmail, its pattern seems to be based on groups I usually send to, and it’s right only 50% of the time. I’m just not that predictable. But it has often kept me from leaving people off who should have been included. The other thing that is of interest to me is predicting future behavior based on current choices. I remember hearing about a job site that would ask people “are you looking for a job now?” The person might say “no” but the site’s algorithm would notice the person reading articles about being unhappy at work, or bad bosses…and so they could predict the person was looking for a job before the person knew it themselves. And so the merging of data with the actual psychology and sociology of human behavior, especially predictive behavior, would be an interesting approach to expand recommendation engines. Get the data, but then layer in the “think like a human” part of it, as many of these comments are suggesting. Finally, this seems to also go with the mega-consumer need of “too much information” and the difference between specialized information that you make it easy and intuitive for the user to find vs. information that you push out to them because you believe it will be interesting. 
As the algorithms get more advanced, I’d like to see more work done on the former. You could still customize your information—just don’t push it to me unless I want it. Pretty soon there will be “too many recommendations”, even if they are well thought out.
Twain Twain Sep 6, 2015

“It almost seems like recommenders are table stakes these days. You can’t even play in the game unless you can do this sort of thing.” Indeedy…HOWEVER… Is it the case that the big legacy and “new incumbents” have too much of an advantage and, therefore, the recommender space is too saturated for new disrupters? After all, they already have all the data, the servers and the quant tools etc, right? Well…. (1.) The machine vision pieces, which the likes of Clarifai do, are on their way to being solved, although there are still known issues related to shape recognition, surface texture extraction and compositional context of objects. * http://www.wired.com/2015/0… (2.) For sure, the understanding Natural Language part of Machine Intelligence and Data Science remains a BLANK CANVAS & BIG BLACK HOLE. One of the “Fathers of Deep Learning”, Geoff Hinton of Google, shared this in October 2014: “If the computers could understand what we’re saying…We need a far more sophisticated language understanding model that understands what the sentence means. And we’re still a very long way from having that.” Meanwhile, Alison Gopnik of UC Berkeley noted: “When we started out (in AI) we thought that things like chess or mathematics or logic, those were going to be the things that were really hard…Not that hard! I mean, we can end up with a machine that can actually do chess as well as a Grandmaster can play chess. The things that we thought were going to be easy – like understanding language – those things have turned out to be incredibly hard. 
Those are the great revolutions (understanding language) – not just when we fiddle with what we already know but when we discover something new and completely unexpected.”There are a number of new experiments to try to solve this problem, including Google Voice and MS Cortana and Amazon Echo recording all the words we say to do probabilistic matching and interpretation of it.************HOWEVER, THERE’S STILL A BIG BLACK HOLE of tools and data yet to be collected. Certainly, the existing databases and semantic structures…CAN’T SOLVE THE PROBLEM.The new form factor for Recommenders will be very very different from how they’ve been frameworked and built for the last several hundred years.Once this new form factor ships…The level of intelligence for the machines will be factorially improved, quantum-fold — in both machine vision and machine translation of Natural Language.And, subsequently, impact on informing our decision-making and economic modeling.It’s not a trivial problem to solve, though. It’s about as NASA / Quantum Information hard as can be.And it needs INVENTORS & ARTISTS.
1. Stephen Voris Sep 6, 2015
  
  Big data could be great for getting computers to understand language – but if you’re not recording the right stuff, the “garbage in, garbage out” saying applies. And when it comes to English, well, there’s information being lost in the mere transcription from spoken to written – and that’s assuming everything else (including word order, paragraphs, and punctuation) is recorded perfectly! Just look at the myriad cases where people have misunderstood each other in text based on differences in inferred tone (see also: why ‘lawyer’ is an entire profession).There may also be issues of premature optimization – consider that it takes your average human a couple of years to even start putting words together intelligibly. In a sense, they’re compiling their own dataset all that time, associating sounds (particularly parental sounds) with visual stimuli – and after two years the result, however adorable to the parents, isn’t going to be winning a Pulitzer just yet. So, going back to machine learning, maybe aiming directly for adult-level reading comprehension is the wrong metric to use right now (though eventually we probably do want to go there, and perhaps beyond).
  1. Twain Twain Sep 6, 2015
    
    Well, it’s a good thing I made my system as easy as ABC to use because this was the audience I had in mind from the start.
  2. SubstrateUndertow Sep 6, 2015
    
    “there’s information being lost in the mere transcription from spoken to written . . / . . differences in inferred tone” Add to that the missing organically intertwined subjective inference layers that are lost when the skeleton that is natural language is stripped away from its holographic sensate-life-data integration which embodies the key link between natural language and human cognition. I think this is a good example of “non trivial causal spread” complexity that will be very hard to mimic with machine learning.
    1. Twain Twain Sep 6, 2015
      
      Embodied integration is the key. Last month, Princeton neuroscientist Michael Graziano wrote a piece proposing an internal “Attention Schema” which the Deep Learning models, to-date, have been missing. * http://aeon.co/magazine/psy… So then…some people will say, “Hey, the robots (and let’s include Viv/SIRI, Amazon Echo, MS Cortana, Google Voice, Jibo et al) are embodied technologies, learning and recording our speech and words. They understand us. They can do recommendations, so what’s the problem?” Hmmn…well…SUPERPOSITION OF OBJECTIVE+SUBJECTIVE associative layers which no AI researcher has been able to work out (yet). That’s why Natural Language meaning understanding remains such a challenge. Not a trivial problem to solve because not even the Quantum Physicists have a workable model for subjectivity. In fact, Max Tegmark of MIT only proposed “Perceptronium as the most general substance that can feel subjectively self-aware” in 2014, and he makes it clear there isn’t a mathematical proof yet. The knot in the problem being somewhere in these mathematical equations. Yes and don’t imagine that Probability and its methods are the panacea tool for coherently measuring and modeling subjectivity. If it was, stack ranking would work well and it doesn’t (see Vanity Fair article on ‘How Microsoft lost its Mojo’ and the section “Bell Curve”): * http://www.vanityfair.com/n…
2. Matt A. Myers Sep 6, 2015
  
  There’s the illusion of curation and quality as well that floats out there — it works until you lose trust.
Twain Twain Sep 6, 2015

I have a reasonable amount of hands-on context about what Data Science, Recommenders and Machine Learning can / can’t do.
I first did Recommender algorithms in my maths degree and then in Econometrics, where I pulled in data on the Tiger Economies to do prediction models of GDP. Fun stuff. Get data. Scrub it. Model it. Sanity-check that it’s statistically au fait with a range of tests.
Got A’s for it and a comment from my professor about how my prediction was different from the prevailing models from bank analysts at the time. Well… time proved my prediction model to be more accurate than theirs.
Then, in a hedge fund, we used 5+ Machine Learning models to do asset allocation. One of our clients was the world’s largest sovereign fund.
Then, in the startup and in banking, a lot of my time involved data, data products and more data.
So I have experience-based views on where its structural weaknesses are, WHERE DATA INTELLIGENCE NEEDS TO GO & HOW TO GET THERE.
Not easy to do but, funnily enough, almost already done.
1. Richard Sep 6, 2015
  
  What was your econometric GDP model?
  1. Twain Twain Sep 6, 2015
    
    It was on whether Hong Kong, S. Korea, Singapore and Japan (I used this instead of Taiwan) would continue to have the GDP growth rates they’d enjoyed for the previous 5 years.
    A lot of the standard data inputs (interest rates, sectorial breakdowns of output, time lags on supply chain etc) can be pulled directly from the stock exchanges and other data sources (venture capital, market research, strategy consultancy databases).
    However, as with most economic models, it’s the context of the qualitative factors that helps the modeler choose which quantitative variables to use and the weighting assigned to those quant variables.
    I had qualitative context on legislative changes that were happening in HK and Japan so I factored that in as a compounding discount rate whilst the prevailing bank analysts didn’t. They took a classical approach based purely on historical time series data.
    The tools and the data sets that the modeler chooses depend on what they’re trying to find or the economic view they’re trying to advocate. It’s as you alluded to in a previous thread: “Which type of cancer?”
    I don’t know where I’d begin with mammography and other cancer detection data sets, though. Periodically, I read medical research papers and I could see how Machine Learning might be applied to the data sets. However, since I don’t have hands-on experience dealing with such data, there’s a knowledge gap. There are also data sets which have yet to be collected in that analytics space, related to a patient’s emotional chemical state, which may / may not affect the cancer as well as how their body responds to medication.
    The data that’s missing and still needs to be collected for Natural Language understanding and economic modeling is something I can build systems for. The cancer problem, less so.
    In both, the latent entropy variables matter. They’re likely a source of causation or may be some type of catalyst trigger.
    These are non-trivial problems to solve and there are a lot of people cleverer than I am in the medical profession trying to model and understand cancer, our brains, immune systems etc.
    1. Richard Sep 7, 2015
      
      But did you check to see if your data was jointly stationary?
      1. Twain Twain Sep 7, 2015
        
        Cyclostationary. The probability distributions for each country vary periodically over time and they’re not lock-step autocorrelated in mean or variance.
Ana Milicevic Sep 6, 2015

One thing I find very interesting is how much scale you need so that machine learning is actually additive. For example, in advertising technology you’ll need to see hundreds of millions of users/impressions before you can truly gain advantage from ML; I wonder if the same stands for, say, diagnostics. In that sense, the scale advantage becomes a speed to market advantage more than anything else. If you’re smaller, you could have better algorithms but you won’t as easily see as many inputs (users, impressions, tissue samples, etc – whatever applies to you) to make it truly worth it.
howardlindzon Sep 6, 2015

our spark app will be unbundling this for price volume and even social as it relates to stocks and markets and scans…dead on albert
howardlindzon Sep 6, 2015

I wrote up unbundling markets http://www.howardlindzon.co…
Joah Spearman Sep 6, 2015

With all the focus on data, there’s a massive opportunity to come around to the user through something they truly feel and recognize and something that competitors in these recommendation-industries have deemed too difficult to measure: authenticity.
Alejandro Cosentino Sep 6, 2015

Although it’s a little different, recommendations in the Online Marketplace Lending space are loans published by each company. Then every lender checks his/her preferences based on the risk pricing analysis they want. There are external APIs that play a lending-robot role to process those preferences. Whether or not to offer that lending-robot service to lenders is one of the questions that an Online Marketplace must answer. Contingencies are not simple to manage in the FinTech space.
LE Sep 6, 2015

I think there is another level to this that perhaps nobody has tackled yet.
What if, when I went to a website, instead of giving me banner ads they were able to customize the site according to my hot buttons? The things that mattered to me? This of course happens with ads all of the time. I look at something on Amazon or elsewhere and all of a sudden I see ads for those products on other sites I visit.
Anyway, if a cruise line already knows (from cookies) that I am a parent with kids age 10 to 13, the actual content on a particular page (not the ads) would focus on stressing the kids club. Instead of, say, the teen rock climbing wall or the dining experience on that ship. And if I visited a page within that site (say wanting to know about off ship fun like sailing) they would re-stress that on other pages on the site as content.
I just got a spam from Royal Caribbean. It caught my attention since the subject said “the Fastest Internet at Sea”. That’s one of my hot buttons, not having good internet (which I need) when traveling. So just on that I opened the email and it stressed the point again in the copy. This was an email and I have no clue if they simply noted when I was at their site that I cared about the Internet (this was perhaps 3 or 4 months ago) or they got the info elsewhere. But what if when I visit their site they do the same thing? They make the features and benefits that they have found I care about prominent instead of serving the same static site to everyone who visits. Somebody must be doing this now, right? Not talking about ads, I am talking about targeted marketing copy as well as graphics….
Alex Iskold Sep 6, 2015

This reminded me of our work at AdaptiveBlue 6 or 7 years ago. Lots of these themes we talked about are resurfacing. This time it feels like it will actually work, and the main reason is that mobile is such a wonderfully constrained environment. A lot less space will lead to fewer choices and more personalization and, more importantly, interactivity. We have several companies in the incoming Techstars class tackling this space and it will be interesting to see how it’s going to unfold.
Dries Buytaert Sep 7, 2015

I’ve been calling this the ‘Big Reverse of the Web’ (http://buytaert.net/the-big…): the transformation of a pull-web to a push-web. The current web is “pull-based”, meaning we visit websites or download mobile applications. The future of the web is “push-based”, meaning the web will be coming to us. In the next 10 years, we will witness a transformation from a pull-based web to a push-based web. When this “Big Reverse” is complete, the web will disappear into the background much like our electricity or water supply. Content, products and services will find you, rather than you having to find them.
James Ferguson @kWIQly Sep 7, 2015

Optimisations - I disagree partly with “in favor of the larger and more mature companies”.
Why? A lot of machine learning heuristics (particularly in pattern recognition) are developed using deep domain knowledge. This means that in specialist niches (particularly in B2B) a very small but very specialised, experienced team pays off.
So mature yes – whereas large – not so much (the benefit of large is credibility and market access but rarely laser focussed quality – which is often the missing ingredient in recommenders).
FWIW – our whole proposition revolves around recommending priority action to “treat” bad energy data by making underlying systems more efficient. The biggest wins arise from scalable domain knowledge and resultant productivity – needless to say – it is better to be world class when it comes to domain knowledge than merely big.
Douglas Crets Sep 8, 2015

I live for the unbundling of scale, and I probably have a career because of it. Kudos.