Why Sound Will Be Bigger Than Video

Alex Ljung gave a talk at LeWeb 2011 in which he argued that sound will ultimately be bigger than video on the Internet. I am a big fan of contrarian thinking and argument and Alex does that very well in this talk. It's about 15 minutes long (there's another 8 minutes at the end of this video that is not related to Alex' talk). If you can find the time to watch it, do that and then let's discuss his arguments in the comments.

Disclosure: Alex is the founder of SoundCloud, which is a USV portfolio company.

#Web/Tech

Comments (Archived):

  1. Dave Morgan

    I agree that sound has a chance to be bigger than video in the mid-term growth of web services, if for no other reason than the simplicity that Alex talks about, but I believe that sight, sound & motion will pass it quickly as services become more robust and the complexity becomes less of a barrier.

    1. Ela Madej

      What do you mean by “Sight, sound & motion will pass it quickly as services become more robust and the complexity becomes less of a barrier.”? Thanks for elaborating on that one!

      1. Dave Morgan

        By sight, sound and motion, I mean video. It’s a much more robust sensory experience than just sound alone. Both people and devices will have more to interact and respond to with video than sound. Today, it is simpler to build services around sound, more two dimensional. In the future, it will be simple with video.

        1. Ela Madej

          makes sense, thanks for taking time to reply Dave!

  2. Mark Essel

    The blending of audio with other activities acts as an interface between the net and the world we live in (when not staring at a screen).

  3. Jan Schultink

    Sound will be the first media segment that will go 100% web. It will be bigger than traditional audio because of new listening occasions (on-the-go, a rebirth of car audio). In the long term, I feel that the $-market for web video entertainment will be bigger.

    1. spaceCat70

      100% web audio? that’s a scary thought. makes me think of 128kbps mp3 streams all under the thumb of RIAA etc. there will always be purists who will stick to vinyl, downloading FLACs, etc. There will be enough people who are passionate about music that it’ll never be 100% web. At least I hope.

      1. Jan Schultink

        OK

    2. Dale Emmons

      I agree. Much of his argument rests on the fact that sound is easier and more intuitive to create because the tools are better and therefore sound will be the winner. That’s a supply-side argument. Internet and mobile audio tools are currently more mature than video tools, but not for the reasons he argues. Audio creation tools are technically and computationally easier to make and use, so they’ve been the first to mature on the web and mobile devices. We’ve seen this movie before though. In the early 90s, audio editing came to consumer level desktops first. Not because there was more demand than for video tools, but because hardware constraints didn’t allow for video yet. Desktop video tools followed about 7 years later and today are a far larger market. The same will happen on the internet and mobile devices. Sound of course will have an important place, but human beings are inherently visual creatures. We consume media more easily when it’s video because that’s how our brains work. So just as in the case of desktop software, or radio vs. television, internet video will continue to be the larger market because that’s what consumers will continue to demand. Disclaimer: I’m cofounder of an internet video tools company.

  4. Ela Madej

    Saw this talk live at LeWeb. Alex rocks. The only piece of feedback I gave him after the talk was that the first point he made wasn’t good so not the best opening (the “one click” one, it’s the same with video so no real advantage). So, at first I asked myself “hmm, I am not sure where this is going” but then he totally bought me. Great talk, great arguments, great constructions of the talk, great energy, great product. (sorry for being uber uber-enthusiastic but I just recommend everyone to see it)!

  5. Nick Tomaino

    I had no idea I Am T-Pain was so big!! Well crafted argument by Alex. Great stuff.

  6. Wells Baum

    Good arguments (sound is emotional, sound is parallel, sound is one click away, sound is more than just music). I’ve used SoundCloud to get audio recommendations from co-workers, to host a Music 4 Japan compilation in which a majority of the tracks came from Musicians that posted on SoundCloud, and to record behind the scenes celebrity interviews in my job. SoundCloud is a useful and engaging tool. But we should use the tools we think will have the biggest impact. Do you think that little girl singing Nicki Minaj “Superbass” would’ve made the Ellen show without a visual (http://www.youtube.com/watc

    1. Wells Baum

      Comments expanded: Unmuting the web with sound is another tool, not the predominant tool now nor in the future, for sharing entertainment and experiences. http://www.bombtune.com/201…

  7. jason wright

    Sound is the first experience of life – mother’s heart beating and her muffled voice speaking. 

  8. Alex R

    Does the “life in parallel” idea undermine the concept that sound can be a good ad-supported medium?  If you’re multi-tasking while you listen, how valuable is your ad impression?  

    1. fredwilson

      I do recall the ads I hear while driving. Probably the same with the ads I hear while doing email.

      1. Alex R

        True… but can those ads be as valuable as video ads, if they aren’t as disruptive and/or engaging? Maybe audio ads are low-cost/high-volume, while video (in comparison) is high-cost/low-volume.

        1. LE

          Listening and getting impact from an audio ad (using radio as an example) is much cheaper (run time and production) and easier compared to video. Video is captive (unless it’s the TV ad that you hear while cooking dinner which is really just audio delivered on TV). With an ad you are only trying to get a point or two across in a short amount of time. So it can then be done simultaneously with another activity because you don’t have to pay attention and get frustrated if you miss something. “Ford’s having their year end sale act now!” And of course in the car listening to radio is even better since you are essentially looking for something to occupy your time while you are driving. How many people have phoned someone and had a long conversation in the car that they would have never had the patience or interest to do if they were sitting at a desk, or in their house, with other *better* things to do?

          1. goldwerger

            Very good points all. I would like to make a few points on the immense value of audio content and audio advertising from the vantage point of an industry insider:

            - Online audio consumption is shown to be extensive and (very) immersive. Based on research we published earlier this year with the involvement of the IAB/RAB (http://bit.ly/fdQl0B), the typical Internet Radio listener listens between 1 and 3 hours at a time(!). In comparison, a video user consuming 5 minutes of video, or a web user spending 1 minute on a web page, would be considered engaged. What a gap.
            - When you view a web page, many things compete for your attention. Multiple content segments, multiple ads, multiple… everything, on- and off-line. When you listen, you can generally only listen to one thing at a time. That is why audio is the most valuable content category – you can parallel process with it, but when added, your audio senses are fully engaged and captive, with one main audio content at a time, serially. And, when you hear an ad, that one ad is getting your full and undivided auditory attention. You have only one pair of ears, which listen to one thing at a time.
            - Audio is unique. That’s why traditional radio commercials have worked for years. And why digital audio advertising works as well and better (given superior targeting). Some things about audio messaging simply don’t exist in other media – deep emotional impact, subliminal message retention, etc.

            Alex Ljung has got it dead on. This medium is only beginning to show the full potential it actually has. The scope of content creation, innovation in business models for delivery of digital audio content, and digital audio advertising – all are in their stages of infancy – in what we are convinced will be a very, very, big market opportunity to all market participants.

            Eyal Goldwerger
            CEO, TargetSpot

    2. leigh

      interesting point…. the technology to track ads in the living room is there – but Nielsen hasn’t deployed it yet (oh big wonder there)…. they are apparently measuring with sound actually – people are wearing devices that can listen for the programming – as long as the device hears the program, it counts it as an impression.

      1. ShanaC

        Actually I do wonder.  Why not sell that data, that could be awesome?

  9. Oo Nwoye - @OoTheNigerian

    Hey Fred, Greetings from Lagos! The ability for parallel creation and consumption is the single biggest reason sound will become really big. However, lowering the barrier to create and distribute sound creates another problem, which in itself is an opportunity. Problem: Noise (literally). Opportunity: Curation.

    1. Dave Pinsen

      Did you see that the first AVC meetup in Africa is scheduled for this week?

      1. Oo Nwoye - @OoTheNigerian

        Em.. Africa is kinda big :). What city is it in particular? Or is it an online meetup?

        1. Dave Pinsen

          I am aware of the size of the continent. I mentioned it because a year or two ago we had discussed the prospect of you hosting one in Nigeria. I’m viewing the mobile version of this blog now, so I don’t think I can see the list of meetups, but if memory serves it was going to be in Kenya somewhere.

          1. Oo Nwoye - @OoTheNigerian

            Thanks man. I think I remember the conversation. I searched for the meetup everywhere but cannot see it. Kenya is quite ahead in terms of a structured tech community. Nigeria has quite a while to go. That is what I am working on at home.

  10. Dave Pinsen

    To make the case for why sound will be bigger than video, you post a video clip.

    1. Mark Essel

      Hahah. But it brings up a not so subtle limitation of our avc community. We’re not in a position to enjoy what soundcloud offers here too often, unless Disqus integrates soundcloud support. Audio comments/shares.

      1. William Mougayar

        Yup. We discussed the pros and cons of sound comments a few months ago. I remember well, as I may have started it. I still believe that if done in moderation it might be a good thing, especially in the threads where there is good conversational back and forth. It could be done as a text-to-voice translation.

        1. Mark Essel

          Perhaps a transition to a synchronous Google+ hangout or voice chat that gets saved as a soundcloud audio converted to text to optimize accessibility

        2. FAKE GRIMLOCK

          THERE REASON EVERYTHING BUT GAMES ON WEB ELIMINATE SOUND. WEB IS VISUAL CHANNEL. MOST USERS DOING SOMETHING ELSE WITH AUDIO CHANNEL WHILE ON WEB. MUSIC, TALK TO COWORKER, ETC.

          1. ErikSchwartz

            Internet and web are not synonymous. I agree about the web being silent except games. I disagree about the internet being silent except games.

          2. FAKE GRIMLOCK

            US AGREE. LISTEN TO MUSIC ON INTERNET, ETC. IF LISTEN PRIMARY GOAL, MUST HAVE AUDIO. IF READ ARTICLE, COMMENTS, SPAMMED WITH PERSON TALKING VERY BAD.

          3. ErikSchwartz

            I am always happy to agree with beings that have the power to eat me.

          4. awaldstein

            I rarely think about this distinction as it relates to how we use them. Great post that someone should write. Internet vs web and the behaviors that are natural if different on each one.

          5. LE

            True. If sound is not a positive and primary it becomes a negative by interfering with the audio channel being used. (I just put in ear plugs because they are leaf blowing outside..)

          6. FAKE GRIMLOCK

            IT LIKE READ BOOK. NO ONE WANT TAPE OF AUTHOR COMMENTARY START PLAYING WHILE TRY TO READ WHAT AUTHOR WRITE.

          7. fredwilson

            that’s alex’ point about parallel processing

    2. fredwilson

      I know. I mentioned that to Alex when I first watched this last weekend

      1. Gregory Magarshak

        Yeah, it’s a bit ironic. The place where sound shines actually is annotating existing things (photos, comments, blogs etc.) and it’s easy to record quickly. (Shameless plug follows, that I think might be apropos.) I’ve been involved with a startup (qwips.com) that has raised several million in funding, and got a lot of brands / celebrities on board. (Well, by involved, I mean that my team has built their entire website.) Right now they have released an iPhone app, and are working on an Android version. Personally, I didn’t get it at first either. I am still somewhat skeptical, but they are really starting to get traction, with MC Hammer and others spontaneously using it. Check it out: Qwips.com.

        1. Gregory Magarshak

          LOL just realized you said SoundCloud is a USV portfolio company. So no chance you would actually be interested in Qwips 🙂

      2. JD Salini

        The problem with Soundcloud is that it always jams up. It’s very frustrating. I like the protected streams and using it for sending music to media, but everyone at press hates it.

        1. fredwilson

          what do you mean “jams up”?

    3. jmorf

      The video clip is an advertisement for how awesome the sound is 😉

    4. William Mougayar

      I listened to it without the video…was multi-tasking…

      1. ShanaC

        me too, though i think his diction made it less powerful

      2. maverickny

        Same here, enjoyed it too. Interestingly, last year I did some short audio podcastlets while this year I experimented with videos – both sets were around 3-8 mins. The winner? Audio by several 1000. I think busy people like to multitask, so audio is more convenient in their workflow.

  11. giffc

    I’m not sure that the competition between mediums even matters. There is an explosion of creation and consumption happening in all realms. Alex gives a good talk, but a few devil’s advocate thoughts popped into my head:

    - when people go to a non-music website with sound enabled, the first thing they usually do is turn off the sound
    - you can multi-task with non-music audio when you drive or jog, but I’m not convinced that you can do this effectively when you are trying to accomplish something online or on your computer
    - if you post a sound recording of a talk, and a video of that same talk, which will get clicked on more? and which will have longer engagement? In an online experience, I would bet on the video
    - you’ve been able to UGC audio (of any kind) fairly easily for years, but there isn’t a UGC sound site with the same size, impact and virality as youtube, at least not in public perception. So what is changing the equation now?

    1. Oo Nwoye - @OoTheNigerian

      Devil’s devil’s-advocate here :)

      “when people go to a non-music website with sound enabled, the first thing they usually do is turn off the sound”

      Because that type of sound is intrusive. Same as when I go to watch a video and I see text. And you mean “non audio”.

      “you can multi-task with non-music audio when you drive or jog, but I’m not convinced that you can do this effectively when you are trying to accomplish something online or on your computer”

      It depends on several factors and what you are trying to accomplish, type of sound, etc. I do not like sound when I am trying to understand text; for some people it is otherwise. My buddy Joel ALWAYS codes listening to drum and bass.

      “if you post a sound recording of a talk, and a video of that same talk, which will get clicked on more? and which will have longer engagement? In an online experience, I would bet on the video”

      Then again, it depends. If it is descriptive, yup, video wins.

      “you’ve been able to UGC audio (of any kind) fairly easily for years, but there isn’t a UGC sound site with the same size, impact and virality as youtube, at least not in public perception. So what is changing the equation now?”

      Soundcloud. That is their aim. 😉

      1. giffc

        Nope, I meant “non-music” for all of those examples because Alex posits that sound is far bigger than just music. I code to dubstep, but I couldn’t do it to an audiobook or podcast. There are workplaces that leave talk radio on all day, however, like a car workshop. It is possible when you are doing tasks that don’t require constant critical thinking, and yes that can be on the computer, like creating production graphics.

        1. Morgan Warstler

          “I code to dubstep” should be on a t-shirt. In fact, “I code to x,y,z” seems like a t-shirt line.

          1. PHP Developer

            I code to downtempo here, but some DnB is good now and then 😉

          2. Hu Man

            I turned the volume down on it but it didn’t make the “emotion” go away. I’m only Hu Man.

        2. PHP Developer

          Actually never heard of dubstep.  Listening to dubstep tag on Last.fm now, tnx 😉

          1. Kevin Pillow

            Try Skrillex on for size, it’s the easy listening of the DubStep world…lol

          2. jonathanjaeger

            Well in a way it’s less “easy listening” due to his hard wobble sound, but I still love it.

          3. ShanaC

            prefer deadmau5!

    2. fredwilson

      Alex and I are betting that soundcloud changes that. We will see

    3. jmorf

      “when people go to a non-music website with sound enabled, the first thing they usually do is turn off the sound” I believe what you are saying is that people turn off sound when it occurs without their consent or expectation. I can’t say I think this is any different for video, I don’t like auto-playing videos any more than I like auto-playing MP3s.

  12. Morgan Warstler

    Most of the web will be replaced by video. We’ve not even come close to what video can do. In explaining almost any idea, video better explains the subject matter to the most people in the shortest amount of time. Even when you judge by the weight of data, the amount of info that you can write with the highest fidelity to another brain – video is more effective. I expect that reading in school will drift off over the next 20 years. Yes you’ll read the words on the video screen, but that’s about it.

    1. benfon

      Video is better for some things – but hardly for everything. Trying to learn a programming language – a video does very poorly for that. Even learning a new recipe for a meal – video sometimes may be better (eg how to dice an onion properly), but not always (a text recipe will often impart the knowledge far faster than any video will if you already know the basic techniques). I actually find the trend to video explanations on the web to be quite tiresome. It seems to be used particularly with all these MVP startups that have a single page with a title and the video – I don’t want to watch a 2 minute video – I want to read a one sentence description of what your site does, if you don’t have that – I head off. Or the other trend of video interviews – to me, that’s just laziness of actually doing any editing and producing a concise dense interview (given most video interviews are not edited and are just a streaming capture of a discussion). I can read an interview transcript far faster than watching the video – and the video seldom imparts any additional info above and beyond the transcript.

      1. Morgan Warstler

        Example: any recipe can be done in less than 60 seconds. The impact of writing the actual technique described to a new brain, one that doesn’t know / remember how to poach counts heavily. Meanwhile, those who really know all the techniques, quickly remember their recipes, quickly start to make their own, and static text recipes act as speed bumps for all but a select few.

      2. Rohan

        Agree completely about video interviews..

      3. Jake Ludington

        One key thing to keep in mind regarding text being more efficient than video or audio is the difference in individual learning styles. Some people get maximum comprehension from text, some people get more from hearing what is said, and there is a subset of those people who maximize comprehension from watching a face as the words are spoken. There are obviously other factors that go into human learning and comprehension.

  13. giffc

    It is an awesome service 🙂

  14. Harry DeMott

    Sure I pressed play on the video player – but then proceeded to listen to the talk while parallel processing (cleaning up my physical mail!) – so for many things – sound is great as an adjunct to something else – but I’m not sure we ever get to the point where sound is bigger than video (and BTW: what’s the definition of bigger? larger files? more important? more pervasive?) A well thought out argument, but sound tends to be a fairly lean back passive medium. tough to get really big that way.

  15. josscrowcroft

    I love it, kinda awkward presentation and tough crowd, but he makes some very good points – I especially like this idea of unmuting the web. We’re gonna bring back a noisy website revival! Auto play background music was just the start.

  16. Branden Williams

    I like this concept. I loved the concept of Unmuting the Web. My thoughts: The world really cannot be enjoyed in parallel. As much as we try to make our brains do it, it is still a single CPU that has to use timeslicing (just like your computer) and scheduling to fake processing more than one thing at a time. I’m an avid reader, and while sound can be breathtaking (such as a beautiful orchestral recording), I can process words faster than sounds. If Alex were to provide a transcript, I would have gotten the same level of enjoyment (notice there were not specific sounds other than his voice) reading it and have saved myself at least ten minutes. I view my role in making myself more efficient as scheduling better. I will be more efficient reading than listening to a podcast because my brain can quickly pick out the relevant parts.

    1. Branden Williams

      Thinking about this (hours later), I would point to the rise of SMS over calling someone.

  17. Jonathan Whistman

    I kept thinking of radio. It’s been around a long time. Books on tape, mp3. I guess I’m not getting the point of saying sound will be bigger than video other than to have a controversial heading to a talk. That’s not to say I don’t get the concept of SoundCloud and the role it’s playing for a certain audience. I was hoping to learn more of the lessons being learned around the collaboration and sharing with sound minus the visual. But even SC adds a visual waveform.

  18. Carlos Leyva

    I am not sure whether “sound will be bigger than video” BUT I have been doing a lot of recording lately for training and other reasons and sound is a much simpler “grammar” to produce in than video. By simpler I mean “more economical” (i.e. satisfactory results in a much shorter period of time). It’s not so much that you can’t “point and click (now)” to create video, it’s that the quality of what gets produced in a fixed period of time is better with sound (for me). BTW, although I was in tech before practicing law, I am now just “Joe User” so this isn’t a tech argument, just an anecdotal observation.

  19. jmorf

    As Harry already mentioned, I am listening to Alex’s talk as well. I’m not watching it. I spend a lot of my day staring at a glossy (sometimes matte!) screen, I don’t want to watch video all the time. Alex is a great speaker, so congratulations and thanks for putting on a great talk, one of the best I’ve seen from Le Web. Sound IS bigger than video, and it’s just going to get bigger. Video without sound has little traction, uptake, or often even meaning. Alex’s point about turning off the sound in a horror film is an excellent illustration of this. Ever clicked on a YouTube video, realized there was no sound and turned it off? I have. YouTube is not just about video, it’s about video accompanied by sound. Video is often just a way to get people to listen to sound: TED Talks, Le Web Talks, review videos on YouTube, non-profit pitches on Vimeo, etc! When I watch a video review, I minimize the video. The video in a Le Web talk or a review video on YouTube is fulfilling an addictive compulsion to watch a screen, and makes us more likely to listen to the sound. So much of the growth of online video is relatively meaningless imagery attached to sound to make it appeal to our great love of (read “addiction to”) screens. Why is sound bigger than video already? Because sound is iTunes + YouTube + Vimeo + Google Music + Hulu, etc. etc. etc. So if it is just the dawn of video, well then Alex is right and sound is just going to get bigger.

  20. Matthew Moore

    Audio is already bigger than video. Just ask yourself how many videos do you watch on Youtube with no audio track. Would you rather listen to the audio from Alex’s talk or just a straight, silent, video track? I recall a study where TV viewers would change the channel far more quickly if the audio track was distorted than if the video were. Video vs. audio just isn’t a fair comparison because video is audio.

  21. Yenkta

    As I watched the video, I found it very cool. I think it may work very well in India (I am from India) if anybody comes up with good apps on the consumption side.

    1. Dave Haynes

      Any particular reason that you think it would work well in India vs another country? It would be interesting to know if different cultures interacted with sound in different ways.

  22. Mike Rowan

    I watched this via live stream when I had only wished I could have been at Le Web. I have to say, I loved this talk and agree with just about everything he said. Sure, I am biased because I am personally focused on bringing sound and more specifically voice back into the non-vocal social world we live in. I think it’s safe to say that sound in general appeals to a much larger audience across entirely different engagement models from music to talk radio. It can be consumed in more ways, and actually make you more productive while consuming it. And more importantly, a wider audience of people can create content using sound, if given the right tools. None of my points touch on what we all miss out on when communicating today in the social eco-systems we live in (myself included) when we don’t leverage sound, such as our voices. Don’t get me wrong, I am writing this long winded comment, and everyday I tweet & don’t use my voice like the rest of you. But, there is no denying that voice has gone missing in social media and I think going forward we will see big changes with sound, especially in the social app eco-systems that are part of most all our daily routines. Take voice for example, it hasn’t been included in the daily social tools we use not because it isn’t a valuable communication vehicle, but because the preferred timing & delivery of how we want to consume content in general has changed. We need to find ways to fit sound and its benefits such as voice back into the things we do, provide the right tools to make it fit, and get back the true emotion & understanding that even the simplest of minds can’t misread or misinterpret. If I could talk more, and engage with people in my social circles I would. I can assure you that my comments on this post would have been a hell of a lot more engaging if you could hear my voice, my passion, and the overall emotion in my comments. And, you could have listened to it on your ride to the office, all while learning a lot more about me and if I’m someone you’d be interested to engage with again in the future.

  23. Steve Hallock

    I did not find this argument compelling at all. It really does not matter much as a point – sound clearly has a place in modern life – it doesn’t have to be bigger than video just like it doesn’t have to be bigger than eating. However I thought the logic used had so many holes in it that it was not particularly helpful in making any point. The thesis, as Alex said in the beginning, is somewhat counterintuitive. There are many reasons why this is so, not the least of which is that video has already beaten sound in many arenas (TV vs Radio for example). When trying to make a solid, counter-intuitive point, one has to present compelling facts (a couple “oh wow, I never thought of that” or “unbelievable!” facts), and yet there were basically no facts presented in the presentation at all. The basic argument seems to be that sound is simpler than video and therefore has more potential due to the many properties of simplicity (easy to create, easy to consume, only occupies one sense and leaves others for other stuff, etc). This is an obvious point that does not actually logically lead to “sound > video” without some other compelling data point to take it there. Silent movies are simpler than “talkies” and you could even listen to your own music or news while you watch them and not miss much. But we all know how that battle played out. In almost every use case he mentioned, video is preferred. Given the choice, people prefer to watch their news. A sound clip of my baby’s first word? I send my parents videos of their grandson nearly every week – a sound clip would seem like a dimension missing. The WORD is not what’s important, the BABY is what’s important, and sound does not do as good a job at transmitting this. We already could be listening to headphones every second of every day and we don’t. The data exists on behavioral patterns. Sound and video have been around for a long time. Clearly sound is great, and Soundcloud is an amazing platform, but I think he might have bitten off more than he could chew with this argument. The thesis could still be right, but the logic is not convincing.

    1. wunderkidchaos

      Agreed. I’m sure the argument holds within certain classes of media, e.g. I would always prefer a “Planet Money” audio podcast to video, and I wish he would articulate those exceptions a bit more rather than make the generalization. Interesting fodder and he did a great job at selling but I wasn’t thoroughly convinced of his argument based on the broad generalizations.

    2. Jose Paul Martin

      Agree with you on this. Video still conveys more than sound. Just because we can parallel process sound, it doesn’t mean it is better. Even if it were simpler, we’d be seeing more crappy stuff being “produced”. Why is it that when we want to give a speech, we put it in writing before we actually deliver it? Yes, it would be easier to record and re-record, but writing adds clarity that speech alone would not be able to.

  24. ErikSchwartz

    They still need to do a much better job with the radio industry. Some of the best audio producers in the US are getting laid off on a weekly basis. Yet SoundCloud’s awareness amongst radio industry professionals (particularly commercial radio industry professionals) is quite low.

    1. LE

      Typically, from what I’ve seen, Internet companies are more of a “build a better mousetrap and the world will beat a path to your door”. I don’t believe they typically do anywhere near as much outbound persuasion to the usual suspects that they should be doing. Assuming you are correct that “SoundCloud’s awareness amongst radio industry professionals (particularly commercial radio industry professionals) is quite low”. One thing that appears to be missing is an advisory board of industry professionals that can help in many areas including what your comment suggests is lacking.

      1. fredwilson

        soundcloud has a large outreach team, focused on musicians, the music industry, and non-music sound. but erik’s suggestion to focus on radio is a good one. i’ve forwarded it to alex

        1. ErikSchwartz

          I just sent you a note offline following up on this.

          1. David Noël

            We’re working on this, Manolo was a key hire for us to help us out with that. http://twitter.com/manolo One great example recently is how @danpatterson is using SoundCloud: traditional radio segments, podcasts and #Occupy recordings: http://soundcloud.com/danpa

          2. ErikSchwartz

            I’ve known Dan for years. I really like his work, in fact I just introduced him to some of my radio industry contacts about a new project he’s working on.

  25. Morgan Schwartz

    It’s a curious argument especially with the backdrop of media history. A more typical thesis would predict a trend towards the adoption of richer and more immersive media experiences: oral tradition > written word > radio > film > television > virtual reality?, gaming? etc. He makes a lot of great points – I actually listened to this talk, playing it in another tab while checking email – but this parallelism is nothing new. How many households in the US have a TV playing ambiently in the background? Our society loves to multi-task but I think it will always opt for/demand a visual option. There is simply too much inertia as a visual culture. His idea that creating an audio recording is 140 times easier than a tweet is only superficially true and doesn’t acknowledge the huge benefits that the limitation of working within 140 characters brings to bear. This limitation forces constraint, focus, and even creativity in our communication. I’m a little nervous about a world that “tweets” in the form of open-ended audio recordings. And on the consumption side – I can visually scan dozens of tweets in the time it would take me to listen to even a brief audio recording. As he mentioned, there are contexts where sound has advantages – driving etc. – but even those may be short-lived as HUD-like displays become integrated into our windshields and eyeglasses.

    1. LE

      “And on the consumption side – I can visually scan dozens of tweets in the time it would take me to listen to even a brief audio recording.” Exactly. Consumption of information is much greater with text, unless the consumption is for emotion and/or entertainment, in which case there is obviously a distinct advantage to having audio. Not to mention that you can also scan written words much quicker and decide if you want to consume further or not, as anyone knows who has compared browsing in a bookstore with using amazon.com “look inside” to decide whether to buy a book or not.

      1. FAKE GRIMLOCK

        VIDEO BIG WITH KIDS. THEM ONLY ONES WITH TIME TO WATCH IT.

    2. Matthieu Catillon

      “radio > film > television” these are backward steps from written. written is usually much deeper and requires more effort and accountability

      1. Morgan Schwartz

        whether these are backward steps is subjective, but in terms of media history, what i meant to write was the “printing press” (as a technology) rather than the written word

    3. ShanaC

      Could sounds be a deeper form of participating in media? I’m not totally sure it is a progression into richer, there still is this element of hot/cold media that McLuhan talked about when it comes to mental participation. I’m actually more curious in how rich media games, immersion games (we’ve never really mastered those) will play out.

    4. Matthew Tendler

      Re: 140 times easier. He’s also not counting the effort of speaking. Running a marathon could be 140 times easier than a tweet if we don’t count moving our legs.

  26. Aaron Klein

    To the consumer, I’m not sure I see the difference between sound and video. It still takes a fixed amount of time to consume. Some people are far better learning from sound without the distraction of visuals. (I’m the opposite – very visual and somewhat kinetic learner.) But to the producer, it’s a sea change. Sound is infinitely easier to produce with high quality than video is. That could be SoundCloud’s ace in the hole…

    1. ErikSchwartz

      That was the argument we made at Foneshow. In a 24/7 media (particularly news) environment where being first is hugely important, you can make quality audio in a fraction of the time that it takes to make video of a comparable quality.

    2. Rohan

      I don’t agree, Aaron. I feel more at home listening to an audio book vs watching a TED talk. :) An audio is somehow easier to follow on the move.. or feels so.

      1. Aaron Klein

        I hear that. It’s fairly easy to turn many videos into audio…don’t look at the screen. But learning style plays a lot into this…auditory learners will always prefer this approach and visual learners will have a hard time “keeping their place” in the material.

        1. Rohan

          ‘I hear that’.. perfect response. You could also have said ‘I see that’.. haha. Very true about learning styles! 

  27. William Mougayar

    Trying to wrap my head around this great title. In theory, the points raised make a lot of sense regarding why this should happen. But I’m curious to learn more about the actual status today of the “web sound” market. It seems that more regular users would need to come on-board in addition to the music artists for that prediction to be true. If we draw a parallel to podcasting vs. video casting, why has the podcasting market levelled off, more or less, while video (youtube, vimeo, ustream, etc.) has taken off? A counter-argument to Alex’s is that the same arguments can be made about video. It has become a lot easier to share a video as well, and one could say: why make a sound recording if you can do a video one and get the added reality factor which sound doesn’t provide? If you’re thinking that sound is competing against video, then I don’t think sound will win. But if you’re thinking that there are new applications of sound where sound could excel as a consumer app, I’d like to see more examples.

  28. LE

    While I would agree with his statement that “sound is a driver of emotions (movies, songs etc.)” and while I definitely think the effect of sound is underrated in this aspect (I can think of any number of movies and TV shows that have been a success because of sound) I don’t buy into his thoughts otherwise. Sound as the delivery of a message is not particularly efficient (unless you are doing it in parallel of course (like driving or cooking at the same time)) or with the example I give below of a possible service I’d like to see. To deliver his points I had to listen for 15 minutes and there is no question it would have been much much quicker to read the full talk (and I’d have better comprehension) than to have to listen for 15 minutes. Ironically I thought his delivery was bad and it detracted from his message instead of enhancing it. I mean this wasn’t like listening to Ashton Kutcher or Dennis Crowley where there is also an entertainment factor that helps with the message. He didn’t really drive home any points with vocal inflections etc. The delivery wasn’t interesting. It was bland. Because of sound. Alex if you are reading this I would urge you to get coached in this area. I’m sorry if I sound harsh. Sound (as presented in the talk) is a product looking for a market which certainly in no way exists to the extent that Alex thinks it does and will. It’s not a novelty at all and it doesn’t solve a particular problem that people have. And if you want any indication of how the availability of pictures is more compelling just think how many people sit around with the family listening to the radio now (like they did many years ago) instead of TV. It simply doesn’t happen because pictures *and* sound are better. Just like a sunny day is better than a cloudy day (and color has replaced b&w as well). Sure people listen to music. But I think most people would rather see a live concert video *and* music instead of just music.
This song is great but it’s even better with the addition of video with classic 1974 production quality (you have to catch some of the 70’s visuals like the child with a guitar): http://www.youtube.com/watc… Here’s what I would like to see soundcloud be able to do. I want to be able to (instead of texting) record a message for someone that gets sent like a text. I want that person to be able to reply to the message that I can listen to as sound. In both cases the message will also be transcribed into text so I can read it if I want. So you can have a sound conversation with someone that is non-interruptive in both directions similar and in addition to the way it is done with just text right now. That’s what I want to be able to do that will solve a problem that I believe many people have (when you’re walking or driving it’s hard to text and you don’t always want to have someone pick up the phone and engage in a conversation). So we can call this enhanced text messaging with voice. There are ways to do this now but not a way that is dead simple that I have found (with both the sound delivery and text transcription as part of the service).

    1. fredwilson

      your advice about getting a coach is good. he’s very coachable. english is not his first language and i suspect that’s a factor

  29. andyswan

    I’ll believe it when the porn companies move primarily to the sound medium. Ya…it’s a joke…but they’re a pretty good leading indicator on le web.

    1. jason wright

      Porn built the infrastructure for gaming. What will push the sound wave?

  30. jason wright

    Has Alex looked at how blind and visually impaired people use the web? 

    1. fredwilson

      great question

    2. David Noël

      Our developers have been in touch with visually impaired users for some time now and together they are testing how we can improve the experience of the site for them.

      1. jason wright

        That’s a good thing, and all sites should do it. That’s a user interface issue. My point is that Alex’s speculative thesis is an unproven hunch, and therefore a significant investment risk. SoundCloud’s data is based on a user group that is largely sighted. People who live with audio but without video may be a source of life in sound that helps to prove or disprove or modify the thesis, or at least stimulate new ways of thinking and new lines of enquiry about the thesis. Live for a web day with a blindfold. This is not about how to interface with a website. It’s about how to think about sound without the distraction of sight. Blind people can make and receive phone calls. They have smart phones. They live with microphones and speakers. They think differently. It’s a potential goldmine of data. You only need that one nugget that could be the eureka moment. Worth a shot I think.

        1. Tim Panton

          This cropped up on a weekly conference call/podcast I join ( http://vuc.me ). A blind member commented that if he has to choose between speed and quality of a text-to-speech engine he prefers speed. Conversely in real conversations he prefers wideband.

  31. Conrad Ross Schulman

    This comment is me + Yolanda be cool (le bump) in my headphones. Alex describes how sound-interaction provides us internet users with “good feelings”- aka Music is fuckin awesome! He also says how we make better decisions when we’re feeling this good from music! (fuck ya!) So if i’m feeling so damn good…what do i wanna do so damn badly? MAKE A FUCKIN PURCHASE! So put in some kind of online store within soundcloud so us internet users can make great shopping decisions while using soundcloud!! (music/fan shop of some kind…)

  32. Percepied

    Radiolab is, for me, the best example of the power of audio only over video content. Part conversation – between presenters, between presenter and interviewee – blended with anecdote, quips, interesting subjects, and great audio mixing. It would be a struggle to deliver video content of a similar quality and imagination without a significant budget. As they describe it: “Radiolab believes your ears are a portal to another world. Where sound illuminates ideas, and the boundaries blur between science, philosophy, and human experience. Big questions are investigated, tinkered with, and encouraged to grow. Bring your curiosity, and we’ll feed it with possibility.” http://www.radiolab.org/

  33. Eric Leebow

    I agree, this is certainly true. I just listen to some videos.  Videos without sound would be boring.  That’s why I enjoy the radio apps sometimes more than video, you need sound first. I also believe sound is bigger when it’s integrated into photos, and that’s the future of social media, photos connected to sound.  That’s what I’m planning with my social networking site, eventually linking sound clips to photos.  I met someone who was blind recently, and they were able to hear.  I spent about a half an hour with them talking about my site and they were asking me how we’re going to make the site work for the vision impaired, and those who cannot see. My first thought is sound, and that’s before scent, because sound can be mirrored to the Web just as a photo.

  34. ROBOTUNICORN

    1. When Alex says that we cannot watch a video or read a blog post while driving, my first thought is that we need the cars to drive themselves so that we can consume our content in transit. 2. I would not have focused so intently on this talk if it was only audio. 3. Alex mentions that we can consume 3-4x more sound than video. This makes me think about how Twitter has increased my consumption of content in general. This increased consumption has resulted in much less focus on each piece of content. Is that a good thing? 4. He says that the ease of recording is increasing sound sharing. What happens when video becomes even easier to record and share? 5. I agree with the link Alex makes between sound and emotion. But how does this apply to the recording and sharing of sound? For the average user, does sound sharing have as many applications as video? Other than music, what emotionally linked sound am I sharing that doesn’t involve video? I would prefer video + audio of my child’s first words to simply audio. The image of my child + the sound would evoke more emotion than just sound. 6. What constitutes sound literacy? 7. I love SoundCloud. 8. Why do we build cars that crash?

    1. fredwilson

      a second robot at AVC making so much sense! what is going on here????

    2. ShanaC

      To 8 – we let cars crumple because people really crash, not cars (this comes from a girl who once missed a door and hit the frame from exhaustion when she was in high school).

    3. JamesHRH

      ROBOUNI – anyone who has worked in the radio business has sold 1,3,5 and sold around 2.

  35. scott crawford

    I can’t help feeling that Alex is trying to sell me a bigger fire hose. 

    1. fredwilson

      he is!

      1. FAKE GRIMLOCK

        MOST PEOPLE WANT SAME SIZE HOSE, MORE USEFUL WATER.

  36. Ken Haase

    I’ve always liked the idea of using sound more and I think the real challenges are in browsing, searching, and interacting with sound streams or blips.  His argument about parallel processing reminds me of an argument from influential SF editor/author John W. Campbell, made in the late 40s/early 50s (I got it secondhand!) that TV wouldn’t catch on because housewives couldn’t do their chores while watching television (as opposed to listening to the radio).  The commentator (whom I don’t remember) claims that Campbell didn’t anticipate the rise of labor saving devices during the 50s, but I also think that he didn’t appreciate that TV would take attention from print as well as radio.

    1. ErikSchwartz

      TV is often used as radio. 

  37. Steve Ardire

    You want some incredible wisdom on use of music, sound, and tempo from a zen master then watch Jerry Lewis ‘Method to the Madness’ http://bit.ly/w1Ptgm who was and still is a genius (and not just comedic) at 85 yo

  38. Tim Panton

    That’s weird, a 15 minute video and 73 comments on user generated sound and _nobody_ mentioned telephony. Alex even waved a phone around and called it a computer with a microphone! I guess this is a failing on the part of the telco world that no-one thinks they can innovate. The nearest thing they have to SoundCloud is VoiceMail and we all _hate_ that. Sound has two weaknesses as compared to text: it is hard to index and it is slower to consume. We will have to overcome these problems before we can give the web a voice.

    1. fredwilson

      but voice mail is not open and part of the web. when it is, it might be hated less

      1. Jake Ludington

        I currently convert my voicemail to text and have it sent to my inbox where it becomes actionable. I can process the info faster as text and the phone number becomes a link I can use to return the call. It seems unlikely audio alone can deliver on that efficiency because it would need to integrate with some kind of smarts that strip all the audible junk that comes with human processing of thought to language. It would further need a visual identifier I could use to indicate that the number at second 23 is the segment of audio I want to interact with.

        1. fredwilson

          i do the same thing. was one of the first customers of phonetag and i can’t live without it

      2. ShanaC

        I’m not sure I want my voice mails and text messages open to the web….

        1. fredwilson

          that’s not what i was suggesting. think twitter and twitter DM. or a public or private video on vimeo

          1. ShanaC

            I need to think about that more. One of the interesting things about the web is that you open up for certain kinds of personal mistakes that can be more public because of copy/paste functions even if they are more technologically secure because of the cloud. Even you have had DM mistakes. Voicemail mistakes could be worse….

    2. alan

      I don’t believe that audio will be bigger than video when it comes to consumer facing platforms like Soundcloud, Cinch.fm, Audioboo, etc. Discovering and consuming audio content, particularly UGC conversational content, will struggle to scale. I do think it becomes bigger when companies, organizations, brands, etc. turn their phones into publishing platforms. There is plenty of disruption waiting to happen, and it’s going to happen on the enterprise side.

  39. andyidsinga

    I wish soundcloud offered ‘sign in with twitter’ ( https://dev.twitter.com/doc… ). i’ve started to use that for most new services/apps when available.

  40. LE

    One of the opportunities here is certainly repurposing existing content to audio only. Along those lines soundcloud.com should partner with people like this: http://mixergy.com/about/ and this: http://www.founderbuzz.com/… For content specific to business interests as only one example. The same approach can then be extended to the non tech work, for example CME (medical education) with organizations such as this (and there are hundreds in just the medical vertical): http://www.hospitalmedicine… …as well as other online learning that can be converted and accessed as audio only. “Online learning…powered by Soundcloud” (has the viral effect as well anywhere offered). Note: the pictures that are showing are from another post. Not sure why they are attached here and there is no way to delete (disqus?) on an edit.

    1. fredwilson

      yup

  41. Alex Nesbitt

    Audio is already more ubiquitous than video. Virtually all media containers contain the capability of carrying an audio file and a subset allow for both a video and an audio file. As a result the number of files with audio vastly outnumber the number of files with video. So by that measure, audio is already bigger than video. But my sense was that Alex was pushing the position that audio alone will be bigger than video combined with audio, which IMO is not likely. The combination of video and audio is an experience that neither video alone, nor audio alone can compete with.

    1. fredwilson

      you are right that he was pushing audio alone, not audio as the soundtrack to video

    2. MissTrade

      @wesleystace aka John Wesley Harding also said same thing last week on the economist. Look for a great video that says similarly.

  42. george

    The vision that sound will leapfrog video? Not clearly presented but I do believe Alex made some strong points. If you look at new features developing in smartphones, car handsfree and perhaps future TV’s, sound will play a more significant role (Siri, Majel), which builds on the premise that “Life is better in Parallel.” Truth be told, “simplicity” does correlate to success and adoption of products and applications. Eliminating clicks or steps to get to the point (desired use) is extremely important and I do believe sound will make a breakthrough in this area. Simplicity is really the key; just look at the 3D market, it’s having a difficult time making an imprint on consumers and it’s not because 3D isn’t a better visual experience but because it’s messy and complicated – glasses, wires-sync issues, weird looking recorders! Sound on: thumbs up

  43. Kevin Pillow

    Big Brother is no longer watching you, they are listening to you instead…..

    1. Rohan

      Hehe

  44. sigmaalgebra

    Ljung expects sounds to become ‘big’, “bigger than video”. Generally sounds are (1) topical and (2) not topical, that is, ‘classic’. For (1) topical sounds, they can become popular for some weeks or months but tend to become of low interest quickly. Altogether, (1) topical sounds can’t do much to make sound “big”. New sounds that are (2) classic have to compete with the best of the past and, thus, are difficult to create. So new classic sounds can make sounds “big” only slowly. Now recording 100 billion sound ‘tracks’ and uploading them to a server is quite feasible, but, with some exceptions, generally finding a desired sound in the 100 billion is challenging. So due to this challenge, generally it is difficult to index, sort, compare, search for, discover, or recommend sound tracks, and, until circumvented, this difficulty will severely throttle how ‘big’ sound can get on the Internet. In one step more detail, I partition sound into (1) speech, (2) music, and (3) the rest. For (1) speech, we can get the text, hopefully from a human typing but maybe from speech recognition software, and sort, index, extract keywords and key phrases, and, thus, make some progress on search for such sound. For (2) music, if we can get some corresponding text, say, on composer and performer, then we can do some indexing, …, searching but encounter some severe problems ‘getting at’ the crucial ‘artistic meaning’. For (3) the rest, we are mostly stuck. As at http://www.avc.com/a_vc/201… from Friday, I like music a LOT. E.g., to pick up a violin for the first time in grad school and, with too little talent and practice, eventually make it through even a little of the most important violin music means REALLY like music. It is possible for a bright person with a good background in music (e.g., my wife) to conclude: “Music doesn’t mean anything”. Further, L. Bernstein wrote a book about music where he explained the difficulty, near impossibility, of saying in words what is said in music.
Or, we say it in music when we can’t say it in words. So, without words, it’s tough for music to ‘mean’ much. But, it’s also tough to say it in music: E.g., in the movie ‘The Shawshank Redemption’, one of the greatest contrasts was the music “Sull’aria” from Mozart’s ‘Le Nozze di Figaro’ as at http://www.youtube.com/watc… No joke. Tough to put it into words although the Morgan Freeman character made progress. But gotta consider the source: Mozart. Not easy to write music as well as Mozart. Yes, a common definition of art is ‘communication, interpretation of human experience, emotion’, and such Mozart is a good example. So also tough to put such emotions into words. Net, for music, tough to get a lot of good music or, at least, a lot more good music than we already have. But, video and music do not always conflict. E.g., as I type this I have playing http://www.youtube.com/watc… which is an Otto Klemperer performance of Wagner’s prelude to ‘Lohengrin’. The YouTube posting has a series of curiously fitting still images from several Pre-Raphaelite painters. For some of the women pictured, got any phone numbers, e-mail addresses, even Twitter usernames? E.g., last week I got on DVD from Amazon the enigmatic, haunting, magical movie ‘Portrait of Jennie’. Part of the magic is the music from Debussy, and more is from one sketch and one painting. E.g., at http://www.youtube.com/watc… is the duet “Presentation of the Rose” from the Richard Strauss ‘Der Rosenkavalier’ and it seems that the video of the opera helps the music. Likely at the senior prom, Bonney was a very pretty girl! But as tough as it is to write Mozart, Wagner, or R. Strauss, a good combination of video and music is still more difficult.
Indeed, apparently opera recordings of just the music are much more common than videos of the whole performances. Or, not a secret, making a good movie is both difficult and expensive. So, to summarize, there’re (1) good music and (2) good combinations of video and music, and (2) can be better than (1), but both are difficult to create with the second usually much more expensive. Moreover, and one reason still to listen to Mozart, good, new instances of either (1) or (2) arrive slowly since it’s tough to improve on the best of the past. For the Internet, music, video, and art more generally present some biggie challenges: Can’t easily use computers to index, sort, or compare art, music, videos with music, or still images. Back to ‘meaning’: As Bernstein explained, tough to put such meaning into words. Even when meaning is in words, tough to have computers ‘get at’ the meaning. Keywords and key phrases make just hash out of meaning. So, tough for computers to ‘get at’ meaning even for art as literature in words. The future of the Internet seems destined to be heavily for content ‘produced’ by amateur users. There we stand to see more cat antics along with videos on how to cook Moo Shi Pork, how to replace a car radiator or a kitchen faucet (not in the same video!). Since the purpose of such video content can often be described well enough with words, search based on matching keywords and key phrases can work; that is, we will be able to find such content when we want it. For such videos, the production quality and costs remain low, and the video is much more effective for the main purpose than just sound.
So for such content, I see video as being “bigger” than just sound. Until we have some progress on computers ‘getting at’ the ‘meaning’ of art, tough to have good search, discovery, or recommendation of art, sound, video, or even text, and, thus, tough to have either become ‘bigger’ very fast. Net, what is throttling much of sound and video content on the Internet are the challenges of creating content with high quality and, then, finding such content.

    1. fredwilson

      curation is the big challenge but people like to curate sound http://fredwilson.fm/

      1. sigmaalgebra

        I said search, discovery, and recommendation but to be more brief omitted the necessarily closely related ‘curation’. For curating sound, likely people are doing something like they did early in the history of the commercial Internet when a person had a Web page with a list of their favorite links. E.g., my post ‘curated’ some of my favorite sounds! So, people will have lists of their favorite sounds. People are back to such ‘curation’ now even for just text: E.g., see the right column of Brad Feld’s blog page. Why? Because for such curation current search engines are not better than such manual methods! How to proceed? One way is to say that Virginia is ‘similar’ to Mary and, then, let Mary see Virginia’s ‘curation’. There can be some utility here, but my view is that the utility is meager and we need to do much better. A core issue is the ‘meaning’, and we need computer means to ‘get at’ that. E.g., how to do a search query to find what is like, in the sense of ‘meaning’, the links on Brad’s page! We can’t yet hope to have computers work with ‘meaning’ much like humans do, but the practical need is just to do well enough ‘handling’ meaning to do well ‘automating’ the desired ‘curation’, and there are ways to do that with some ways better than others! Of course, and especially for sounds, we can’t approach ‘meaning’ just via text, keywords, and key phrases! So I would have in mind, what? Right: Some math. Will see this math in a Stanford CS course? Not likely! The prerequisites? Nope, not in CS but elsewhere at Stanford! In a community college? Nope!

      2. Steve Ardire

        Fred says ‘but people like to curate sound’. Evidently so, i.e. Mobile App Measures How Good You Are In Bed via @PSFK StumbleUpon http://bit.ly/vcgTgH A Swedish ad agency came up with an innovative way to promote safe sex among young adults. Ester handed out 50,000 condoms that were embedded with a unique QR code. When the barcode is scanned, it installs an app on your smartphone that can measure factors of your love making session such as sound, duration and rhythm. Each time the app is initiated, it reminds the user to put on a condom first before beginning a new session.

        1. fredwilson

          what will they think of next?

  45. Joe Lazarus

    I’m fairly certain that sound will be bigger than video. It may already be bigger. If you could add up all the time people spend listening to audio while at their computer and through their mobile devices, I wouldn’t be surprised if that number is already higher than time spent consuming video. Granted a lot of that sound is not being transmitted via the web yet (it’s often stored locally in iTunes, etc), but the act of consuming audio from our digital devices is probably already bigger than video. Soon, I think much of that same sound consumption (mostly professional music & talk radio), will move online through streaming services like Pandora, Spotify & SoundCloud. That said, I think there are a number of things that need to happen in order for sound creation to become mainstream… if it ever happens. Photos and video are easy to capture and fairly interesting to view at least to the small group of friends and family mainstream people share them with. Producing interesting audio content is much harder to do. I’ve tried to use SoundCloud’s iPhone app to record bits of my life, but I find it hard to come up with much worth sharing. Regular day-to-day events are filled with lots of “dead air” that’s dull to listen to. Other audio-rich moments like being at a live concert are often difficult to capture with decent sound quality. I suspect that sound will outgrow video, but that much of it will be professional or semi-pro stuff. One notable exception where I do see big potential for mainstream audio creation is in the area of private or semi-private archiving of sound. If I could record meetings and calls at work, for example, easily upload them to a secure website, if the speech from those files was converted to text, and all of my clips were archived and searchable so that I could jump right to the moment in a conversation about a particular topic I want to listen back in on, I would use that every day. 
I imagine we’re close to a point where speech recognition technology is good enough to make this sort of thing possible. 

    1. ShanaC

      Yes to semi private.  There are times when I want to remember moods.  There are times where I have to retranscribe stuff.  These are huge uses for me, alas they cap.That is something I would pay for when needed.  per minute recording.

  46. William Gadea

    I think it’s a pretty weak argument frankly. Is sound really easier to capture than video? Not really. Can sound be consumed in parallel more easily? Sure, but that’s a double-edged sword. Is it a badge of honor to be the medium to use when people are only half paying attention? Video’s strength is that it’s an immersive experience. Since the audio-visual media were invented, people have had a strong appetite for that experience. New and exciting ways to create and deliver audio will develop, but there’s no need to set an unrealistic benchmark for comparison.

  47. D.B.

    Alex’s speech is a good test case to see if we can have sound only. While listening to his speech, I tried NOT to look at the video as a way to demonstrate his arguments. It was almost impossible… either my eyes kept coming back to the video, or I lost concentration. I wonder if it’s only me… I think Alex is trying to “reinvent” Sound. I’m not sure if Sound can go to places where it is not today, as he is suggesting. If Alex were right, people would communicate much more over voicemail than SMS/e-mail. That’s not the case…

    1. ShanaC

      I don’t think it was the speech’s material’s fault as much as intonation.  Sound shows lots of weaknesses as well as strengths of the material it shapes.

  48. Authy

    Incidentally, I heard this video while coding… video was boring, sound was interesting.

  49. jason wright

    1960. Radio, Nixon won. Television, Kennedy won. 

  50. tyronerubin

    @fredwilson just watched now. Watched most of LeWeb, funny how I missed this one. You must know how much this talk means to me and what I am trying. Thanks! And thanks for helping me feel a part of the sound content creation.

  51. ShanaC

    I’m finding the one thing he missed was the one thing I love about sound. Sound has this fluid quality to it.  Not just time fluid, but another kind of fluid, it can be both a very passive experience and one that is highly engaging. I’m finding that over time I am using soundcloud (in very limited ways) the way I would use a radio I could record with.  I track musicians (and other sounds).  I search for them, and then I record my moments as well.  I’m finding that I want to list and remember sounds for some other quality, something ethereal, that would be silly to catch on video.  I sort of wish that when it came to variety (especially mainstream artist variety) I could use soundcloud over youtube.  Youtube still has a larger variety of stuff, even though it isn’t nearly as lightweight. I just wish they had a better search process.  I recognize that is one of the most difficult parts of sounds.  They are incredibly hard to describe.

  52. another cultural landslide

    Forgive us for being somewhat of a contrarian’s contrarian – but while we do agree that sound will be much bigger than video, it’s for very different reasons. We see Alex’s basic arguments as things that describe The Here & Now, moving only slightly into the future – while ignoring certain market forces that will change these things radically five years down the road. (For the following examples, we’re only using U.S. market forces – as far as the rest of the world goes, Your Mileage May Vary.) Two of these forces:
    • Bandwidth Caps: We’re already seeing these in effect on mobile; and with the limited competition in land-line access, many of the providers are desperately trying to find ways of moving into “metering” access (TimeWarner & Comcast are prime movers in this arena) as a way to protect their core businesses, plus create new streams of revenue; as these caps take hold, suddenly streaming HD video will start impacting the consumer pretty hard – but since audio uses a fraction of video’s bandwidth, caps shouldn’t impact streaming music users. In other words, video consumption via streaming will decrease, while audio should remain the same…
    • Music Streaming Rights: What will impact streaming music users are the onerous rates streaming services are being charged by SoundExchange & the three major labels; this almost guarantees these services’ failure in the future – especially around 2015, when the current agreements with SoundExchange expire. At that point, SoundExchange will most certainly seriously jack the rates. Hard. (They’ve already pretty much said they will.) This is the greatest problem streaming services face, and it’s one we can’t see them beating.
    At the same time, non-label music availability should increase greatly, due to the rapidly growing number of musicians who discover the new models of distribution available to them – and as long as they also see that these models also require them to think differently about their music, they should do okay. (But that’s a whole ‘nuther post…) Both these market forces bode well for sites like SoundCloud. BTW, as musicians, we have one pet peeve about SoundCloud – the damned “waveform” display. For musicians, this kinda gives away the expected dynamic surprises in the music ahead: by looking at the waveform, you can expect the music’s about to build; or if you see one monstrous brick of a waveform, you know it’s gonna be LOUD.  😉  k & w

    1. FAKE GRIMLOCK

      METERED INTERNET LAST EXACTLY AS LONG AS IT TAKE GOOGLE TO SWITCH ON ALL THAT DARK CABLE IT BUY UP. THAT WHOLE POINT OF BUY, TO HOLD AGAINST HEAD OF CARRIERS WITH FINGER ON TRIGGER.

      1. ErikSchwartz

        Metered internet is inevitable because that’s the way costs associated with delivering internet scale. Usage is growing faster than cost of delivering is coming down (especially if Netflix, Hulu, HBO Go et al continue to thrive). PS I hope I do not get eaten.

        1. FAKE GRIMLOCK

          ME THINK MARKET FORCES PREVENT IT. TIERED MORE LIKELY. GET X SPEED FOR Y MONEYS. CONSUMER HAPPIER WITH HAVE AS MUCH AS WANT AT SLOWER SPEED (WHICH SAME AS GET LESS TOTAL) THAN HAVE TO DEAL WITH FEAR OF SURPRISE BILL.

          1. ErikSchwartz

            I don’t think the bandwidth tiers will solve the problem of contention in the last mile. Data transfer volume tiers would work better (but still don’t really address the issues).

      2. another cultural landslide

        While I totally agree with the Giant RoboDino, never underestimate the stupidity of people who work in Big Television.

  53. Sandy

    Is this available as an audio only podcast?  </irony>

    1. fredwilson

      i think so. alex was working on that last week.

  54. Mike

    I have been using Soundcloud as a poor man’s youtube.  I partake when I want a song/speech/lecture without the added bandwidth costs associated with video.  Why not play with the idea of restriction to foster creativity and expression?  Why is there no sound option on services like Twitter & Bnter (e.g. give users ten seconds of sound per tweet/statement, then play through the path of conversations)?  Also, I’d love for this blog to have an audio component so I could play back both Fred’s ideas and the subsequent paths of commentary. Whoops, I just saw the qwips.com site.

  55. Peteris

    Sound sucks as a medium for message delivery – it does pack a bit more nuance and has a greater emotional effect, but it has so much time overhead!  It takes 30 minutes to hear something that can be read in 5 minutes – this means that I think thrice before listening to podcasts and such if there are no transcripts available. A serious pain point, for someone to sell me good speech-to-text that integrates with any random A/V content that I can find on the web – make a browser extension that can give me a transcript for a youtube video such as this one, and I’d pay for it.

  56. Imobiliare Bucuresti

    nice one

  57. Dan Patterson

    Advantage Audio:
    - My time to consume audio is exponentially greater than the time I have to consume video.
    - I can multitask while consuming audio.
    - Audio engages the imagination in ways video simply cannot.
    - The mind absorbs and retains information from audio better than video.
    #twocents from a life-long technologist and broadcast journalist.

    1. fredwilson

      you and alex are on the same page

      1. Dan Patterson

        Good, we should be 😉

  58. ahumancapitalist

    I am very ready for an audio Summify that spies on my clicks, picks podcasts/talks relevant to my interests, and downloads them when I’m on a Wi-Fi connection.  Probably a slider on amount of desired content to match variances in available parallel processing time (e.g. 15 minute vs 1 hour commute).  For audio Summify to be as good as Summify, it probably needs an audio Twitter.  Doesn’t need as many users as Twitter b/c all you need is a sufficient number to do the crosswalk between text clicks and audio likes.  iTunes, Spotify, and Pandora don’t fill this gap.  SoundCloud is tackling discovery from the right angle to be this player.  My Q is whether their product roadmap has them sticking to music.  Still a great product if so, but not the product I need to get the product I want!  The reason I want the Summify and not the intermediate product in the audio space is: 1) While I love text-based discovery, I want my audio curated, and 2) When I do audio discovery, I’m often out of the house with insufficient internet speed to download an hour-long NPR podcast.

  59. JamesHRH

    I just scanned down the comments to see who was commenting and what they were saying… there’s a clue, obviously.
    Gold Medal – Scanning
    Silver Medal – Reading
    Dual Bronze Medals – Viewing video & Listening to audio
    Video is totally immersive (I can’t even have people talk while I watch something I am really into). Audio is nice to have around you (nice is never good, even if it is everywhere). Man, I must have to edit / reformat every second Disqus comment – what is with that, anyone?

  60. Ryan Tanaka

    Well, hope it turns out to be true – as a musician, I’m always looking for alternate models to get my stuff out there. Most people here are probably familiar with the practice of people posting albums of music on YouTube now…the visuals (for the most part) are usually uninteresting but people use the network because of its easy indexing and access.  The web needs a sound version of this at the moment…SoundCloud gets very close, but it needs to rank better on search engines, imo. Right now I’m experimenting with the idea of posting all my music directly there and trying to get revenue through their Adsense program.  Making a real “music video” is hard and expensive, and it doesn’t let me post as many things if I go that route – sometimes I just plug my phone into my car and listen to music off of Youtube on the go, so I’m hoping that people might do this too.  It’s kind of a waste of resources in some ways I guess, but the ease of access makes it worth it, I think.  Musicians have been fighting against the “visual” music industry for a long time now.  I live in LA so I know how looks can often override what the music sounds like, a practice that I think has backfired on them in the long run.  But it’s a cultural thing at this point, so it’s largely up to the consumers if they want to move onto something different.  In some ways, maybe that’s what the speaker is talking about too. One interesting thing about music in the digital age is that despite being “freed” from technological limitations, both listeners and musicians still continue to find the “album” format somewhat reassuring.  Is that going to change, you think?  I assumed that it was going to be outdated by now so I avoided making any up until this point, but my latest project looks like it’s going to turn into an online version of one.

    1. fredwilson

      my teenage son’s friends always post their songs as videos on youtube and rarely as sound only tracks on soundcloud or other audio sharing platforms. i think it is because they grew up with “video music”. i think they are doing themselves a disservice because so much of the reaction ends up being to the video instead of to the music

      1. Ryan Tanaka

        Yes I agree. I think that Alex may be onto something, but he has a long road ahead of him because a lot of these practices are culturally ingrained rather than a product of pure function. The “album” thing is proof that as human beings we tend to be very ritualistic, even in the digital age. Did you know that in Hollywood, bands practice in front of the mirror to see how they look while rockin’ out? That might explain a few things about why things are the way they are now.

        1. jason wright

          That could be their ‘stage craft’ choreographic preparation. However, so many successful bands seem to miss the stage stage and move straight to the music video stage as ‘studio engineered’ product.

  61. gregorylent

    i want a comment on twitter taking $300 million from saudi arabia http://www.bloomberg.com/ne… what will be the effect on twitter’s use for mid-east (r)evolutions? tone-deaf decision, thinks me.

    1. fredwilson

      saw it. anyone who invests in twitter nowadays doesn’t have any control or governance on the company. it’s pretty much the same thing as buying stock in a public company.

  62. heuristocrat

    I love that Alex and USV are doing something different and trying to push the envelope with sound. But it does feel like pushing. Few if any would argue that sound is unimportant to our emotions and how we perceive things. Beyond the obvious example of music, good film directors know how much the soundtrack influences how we “see” the film. It’s hard to have a good argument or discussion here because the definition of “big” is hard to fathom. If it’s just about how much content there will be, yes sound will be big and maybe bigger than video if it’s easier to create. But since this blog is run by a VC who is in business to generate investment returns you wonder what “big” means in terms of a business model. YouTube is able to use an advertising model and generate substantial revenue and profit. That’s harder to see in the sound space. For example there’s The Conversations Network, an all-audio initiative that records and produces the audio from many conferences. I find it hard to listen to talks but much easier to watch a video from Ustream, especially if there are good slides. The Conversations Network ultimately had to find a way to include the slides and sync them to the audio for their listeners to really enjoy the content. You can buy “stock sounds” now at places like iStockphoto.com. We’ve played with using sound in our own businesses but our clients (investors) typically want to consume information *faster* which means a picture and a few words wins over sound by a mile. (The other way to think about parallel processing is that it’s about speed and efficiency and sound is pretty slow.) Of course sound is “big” but how does that translate into business? We’ve done some work for a cool company called JamHub (www.jamhub.com) that solves a problem bands have but that’s a pretty specific example. SoundCloud has been a cool company for years and Alex is a terrific leader. I hope they continue to grow and find ways to create a business around what they do.

  63. Esayas Gebremedhin

    movie makers know that sound is the master and image the slave. alex is predicting a future he wants to create. i like and use the same strategy. but if i was in charge to look at it from distance, two facts are challenging for what alex is proposing. 1. our life is more handicapped if we were blind than if we were deaf. 2. video is image multiplied 25 times per second. the challenge for sound (cloud) is nature and i would like to believe that the un-muting process is more a symbiosis between eye/ image/ moving image AND ear/ sound/ music.

  64. Ciaran

    Brilliantly meaningless from start to finish. To start with, sound probably is already bigger than video if you add up the amount of time that people spend listening to radio, iPod, etc. compared to that spent watching audio-video content. Then there’s all sorts of lovely little pieces of start-up fortune-cookie philosophy:
    * Adding a record button to your app doesn’t make it easy for people to create things, it makes it easy for them to record things. Recording something doesn’t really make you a creator.
    * Yes, it’s bad to watch video and drive. That’s why we have radio. And?
    * Yep – sound means more than just music. Podcasts, ebooks. Again, and?
    * We’re all going to become sound-literate? This is literally gobbledygook.
    Obviously a very clever guy, and I’m sure very nice, but seriously, you might as well have a talk called black is bigger than blue. Or, more accurately, fish is bigger than balloon. If this is just, as I suspect, a long way of saying ‘Soundcloud is going to be worth a fortune and our valuation is, like, way too low’ (with a Swedish accent), then I’ll bow to the superior investment skills of many of the other commenters here, but simply suggest that we all take a minute to examine the history of the last 20 years of the recorded content industries, both on & offline.

  65. Michael Dillhyon

    For inspiration, I cued up “Video Killed the Radio Star” on SC while reading this post. Fascinating dialogue about how the world will want to eat rich content. Measuring the market potential of each form of media is relative to the reference point of the consumer…with a heavy weighting on the ease of curation of the content. With that in mind I would posit that the majority want to be immersed in the content some of the time (i.e. movie theater), but with the oceans of interesting content available…the majority prefer to scan most of the time. The video versus audio debate then hinges on how to artfully solve for the latter. Now for my shameless plug on behalf of audio. I have a stake in a UK-based startup: Call Trunk. A source of lively discussion as the core cloud-based technology records phone calls on just about any device. The API integrates easily with other cloud services for almost ubiquitous storage and management capabilities. Fairly impressive telephony platform, but the really intriguing story about this little firm is what is around the corner. Nailing the phone voice (a different animal than other forms of audio) as searchable data (versus the fairly useless signal) interface will be a game changer…IMO overcoming any of the misplaced latent negativity with the obvious goodness of such a useful capability.

  66. Rileyhar

    A compelling presentation that unfortunately was somewhat marred with distracting “Ums” throughout the presentation.  If one is going to promote the power of audio one should get the audio correct. Riley

  67. drew hansen

    FounderLY recently interviewed Alex about SoundCloud. in addition to his belief about why audio will be bigger than video, he shares his background and passion for music & tech and how it led him to where he is today. i posted it on forbes: http://www.forbes.com/sites… i’m interested to see where soundcloud goes with their high-end, professional tools, which he alludes to in the interview.

  68. Geoffrey Weg

    I completely agree with Alex’s argument about the advantage of being able to do other activities in parallel while listening to sound. I listen to a ton of podcasts while running (ThisWeekIn, NPR, etc.), and NPR on the radio while showering, dressing, cleaning, organizing, etc. (despite driving my fiancé crazy!). A lot of what I listen to is actually the audio of a video. And I rarely find time to sit down and watch these videos. 

  69. Andrew Dubber

    Sound is really important to me. I sort of wish Alex was right about this so I could go “yay sound!” – but he isn’t. He isn’t in part because the assertion itself is meaningless (define ‘bigger’ – and is that necessarily a good thing?) – and partially because it’s an attempt to predict the future (a sure-fire method of being wrong) – but mostly because it misunderstands media, communication and the ways in which human beings construct and interpret meaning. He might as well have said that live theatre will be bigger than cinema because the trend is toward more accurate 3D experiences. I can understand why he would make this sort of assertion, of course. As a short-lived PR grab, it’s a bit ham-fisted, but it’ll score a few column inches in a few tech publications. But as a polemic and a piece of rhetoric, it’s entirely flawed. He’s started with a conclusion (effectively a bumper sticker slogan rather than an insight) and has gone looking for supporting evidence for that conclusion, rather than starting with the evidence and reflecting upon what that evidence points to in aggregate. And you’d end up with something way more interesting, useful and, above all, profitable if you did that slightly more difficult, but ultimately more meaningful task. Thing is, sound doesn’t need to be ‘bigger’ than anything in order to be important. However, sound plays an important role across a wide range of applications in which Soundcloud could be integral, but which it chooses to ignore. Both radio storytelling and disability have been mentioned in this thread already, but there are far more overlooked opportunities here. I’m a fan of Soundcloud and what it does (and I’m especially a fan of its people and their passion for the product) – but this is the kind of “In the future we will all…” nonsense that usually only the likes of Gerd Leonhard trucks out. And that’s kind of disappointing.

    1. jason wright

      “…but there are far more overlooked opportunities here.” Do tell.

      1. Andrew Dubber

        An afternoon of brainstorming would bring up a wealth of them, but off the top of my head… Oral history’s a big one. Nobody’s done that well online yet – particularly in a way that links things together in a way that could provide a resource for future documentarians. Imagine I recorded my father about living through the second world war as a child in Glasgow, with his father away fighting for the first five years of his life, living in relative poverty before moving to New Zealand at the age of 13 with his family… and so on. Now imagine I could tag, rather than simply comment, on the timeline. Then imagine someone wanted to search for Creative Commons content to make a programme about emigration. Or childhood. Or post-war Britain. Or short story writers (that’s what he does now). Suddenly we have an incredible tool for finding an incredible range of material that can paint a picture in sound (or sound with images) of who we are as a civilisation, through any frame we might later choose. A central repository for the world’s radio archives would not be a stupid idea either – especially if it was similarly searchable and research-able. What about a project to crowdsource every public domain book as an audiobook? Recordings of phrases and words in every language and dialect in the world, properly annotated, would give the basis for the development of a true Babelfish – either through natural language modelling or drawing on a database of actual recorded phrases – so that I can have a Skype conversation with someone anywhere in the world – and we can speak our own language, and hear our own language, despite the fact that neither of us would understand a word the other actually spoke if face to face. Like I say – these are just off the top of my head and need proper thinking – but the power of sound, of speech, of environmental sound, of music is a massively under-explored area of online media. “How can we be a bit like radio?” and “How can we beat YouTube?” seem like the wrong questions to be asking.

  70. Kalan

    He’s not comparing sound and video. He’s comparing sound, and video with audio.  If it was a choice between sound and muted video, he might have a case. But video sites already include sound. I think it is difficult to argue that sound alone is better at communicating and conveying emotion than video and audio combined.

  71. zzpreneur

    Great insight. I “LISTENED” to the speech this morning on my run. One thing that really stood out to me was the attachment of “EXPERIENCES” and sound.

  72. Prokofy

    Yeah, most people want a soundtrack for their life, not to watch some silly overstuffed music video choreography. Have you noticed how really over-the-top they have been getting? (Rihanna, Lady Gaga)

  73. Karan

    Not sure if I entirely agree with what Alex says, and here is why: He says: “Sound will be bigger than video” – He seems to imply that because it is easier to click a button and record a thought, sound bites will replace tweets! Most sound bites are not worth listening to unless they are refined.  I am afraid most sound bites will be simply brain farts. Furthermore, a tweet with a link offers dimensions that a sound bite can’t – Additionally pictures – and that is what 140 character tweets are – register and are processed differently than sound.  I can process tens of tweets in a flash, can’t do the same with sound. Alex also seems to think that because recording is simple – creating and sharing will be more prolific – Do we really need more mindless content!! For mindful well produced content there is iTunes and other platforms that let you listen to great music, lectures, dialog – contemplate and let your imagination run wild!!!

  74. Javier Hernandez

    I am more inclined to think that the third point is what triggers the value of Soundcloud. Sound. The same as YouTube made us discover all different types of videos, SoundCloud I suppose will facilitate access to more content that wasn’t easy to upload on YouTube. A new tool that deprecates the old tool. Like online did with print, but in media.

  75. jeffyablon

    The problem with A/V is you can only digest it linearly, at the pace it was designed to be viewed/listened to. Words, on the other hand, can be skimmed. The argument a bunch of others have made that you can put audio in the background, whereas video requires you to single-task, is SPOT ON. Video, let’s face it, brings a bunch of problems. Other than as entertainment or to have a conversation, it’s … BAD

    1. jason wright

      This is a very good point. I was listening to audio recordings of Fred’s MBA Monday posts yesterday. They’re a good alternative way of accessing Fred’s ideas and information, but I’m being pulled along by the linearity of the sound medium and at the pace of the voice. Stopping, starting, rewinding,…it’s not like text.

  76. Dino Dogan

    Audio will be bigger than video because it’s simpler. And he used Twitter to prove his point. I submit to you that twitter is popular for more reasons than its simplicity. Also, he is examining only one side of the equation. The makers of sound. What about the consumers of sound? I think a better analogy for this is TV and Radio. Once TV came along, Radio lost. This is a more direct analogy than Twitter, at least. Also, TV and Radio have a history we can look back on, and if history is any indicator (and it usually is), audio will have its place, but video will rule the day. I could list a bunch of psychological reasons why this is likely going to happen but DisqUs is cutting me off already 🙂

  77. jason wright

    If Alex is wrong what happens to SoundCloud? How critical is his thesis to the company’s future?

  78. johnfurst

    Listening to the audio of the video right now. (Player in the background.) I am not sure what he means when he says “audio will be bigger” than video. What is bigger? He did not define that for me. It’s certainly not volume of internet traffic. Video will consume more bandwidth and video bitrates are still growing. He might refer to usage: Getting more users, more average usage time, …. something along that line. And I certainly get his point. “Audio is much easier to create and even more easy to consume.” However, it’s still a prediction that needs to come true. Remember:
    ** A picture says more than 1,000 words. (Moving pictures even more.)
    ** No matter what, some information needs to be read in order to be comprehensible.
    ** Also remember that most online videos (those created by amateurs) have terrible, awful sound quality.
    I close with hoping that BIGGER doesn’t mean just more one-time entertaining (but rather meaningless) cat videos. Sorry, I mean cat audios. As far as info products are concerned I certainly welcome more and better audio versions. Yours, John

  79. mackcolin

    We had sound AND speech way before Gutenberg.  The introduction of print enabled the development of literacy which spawned higher level thinking.  To suggest that speech will supplant anything is really silly.  It is so easy to take our own communication skills – reading, writing, speaking – for granted in the interest of promoting your own business interests – especially among the literate, educated, wealthy intelligentsia.  Yes, the Orson Welles 30s radio program is a compelling example of the argument.  But technology, MULTI media’s influence, and the Internet are teaching us new things about the powerful global spread of information without the need, necessarily, to make money at it.

  80. William Mougayar

    I guess you were right…given SoundCloud’s last round of $50 million financing at $200 mil valuation http://www.digitalmusicnews…

    1. fredwilson

      Numbers are wrong but the rest is right

  81. ErikSchwartz

    At some point the signal needs to start moving the molecules. That’s where the sound quality of most phones breaks down.

  82. Rohan

    Ah. Heard that request from you before.

  83. ErikSchwartz

    They’d sell to a certain market, but I can’t see it not being niche.

  84. ShanaC

    I’d pay for that.  I think that would do well, as it would be more like a minicomputer rather than a phone.  phones feel more disposable.