Check In Apps vs Audience Measurement Panels

I spent a decade as an early-stage investor in and board member of comScore. In that capacity I learnt a lot about using representative samples to develop broad-based market research on media viewing. This is the approach used by Nielsen in television and comScore on the internet. Both companies assemble representative panels of users and then scale up the data to estimate media viewing at large. There is a lot of data science involved in this approach. Both companies have built large and valuable franchises with this technique.
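
To make the scale-up idea concrete, here is a minimal sketch in Python of post-stratification weighting, the textbook way to project a panel onto a population. It is not a description of what Nielsen or comScore actually run; the function names, the two demographic groups, and every number are made up for illustration.

```python
def project_audience(panel, population):
    """Project panel viewing onto a population, one demographic group at a time.

    panel:      {group: (panelists, viewers)} counts observed in the panel
    population: {group: people} known size of each group in the real world
    """
    projected = 0.0
    for group, (panelists, viewers) in panel.items():
        if panelists <= 0:
            raise ValueError(f"no panelists in group {group!r}")
        rate = viewers / panelists             # share of this group that watched
        projected += rate * population[group]  # scale by the group's true size
    return projected


def naive_projection(panel, population):
    """Ignore demographics: apply the overall panel rate to everyone."""
    panelists = sum(p for p, _ in panel.values())
    viewers = sum(v for _, v in panel.values())
    return viewers / panelists * sum(population.values())


# Hypothetical numbers: a panel that is 90% men, drawn from a 50/50 population.
panel = {"women": (100, 30), "men": (900, 90)}
population = {"women": 500_000, "men": 500_000}

if __name__ == "__main__":
    print(project_audience(panel, population))
    print(naive_projection(panel, population))
```

With these invented numbers the weighted projection comes to 200,000 viewers against 120,000 for the naive one, which is the whole point of knowing how your sample differs from the population you are projecting onto.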

Yesterday, I came across a blog post by our portfolio company Get Glue where they correlated their "checkin data" with film box office results. For those that don't know, Get Glue is the most popular entertainment checkin app for mobile and web. Get Glue has well over a million users and had over 4 million entertainment checkins in April. So the question is – can a checkin app be a representative sample for the purposes of measuring and predicting entertainment product performance?

Here's a chart of checkins vs. box office results:

[Chart: Get Glue movie checkins vs. box office gross]

Get Glue goes on to explain the math behind this graph:

As you can see, there is a clear correlation between check-ins and box office dollars. The gray dotted line represents the average relationship between the two. For the mathematically inclined, to get the trend line we performed a simple linear regression and obtained an R² value of 0.95. In other words, 95% of the variance in the data was explained by the trend line. A perfect correlation would have an R² value of 1.0.
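
For anyone who wants to see what that calculation looks like, here is a minimal sketch in plain Python of a simple linear regression and its R². This is not Get Glue's code or data: the function names are mine and the five data points are invented placeholders, so the number it prints says nothing about real films.

```python
def linear_fit(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted trend line."""
    slope, intercept = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Invented placeholder data: (checkins, weekend gross in $ millions).
checkins = [1200.0, 3400.0, 5100.0, 8000.0, 15000.0]
gross_musd = [4.0, 11.0, 14.0, 27.0, 46.0]

if __name__ == "__main__":
    print(r_squared(checkins, gross_musd))
```

An R² of 1.0 means every point sits exactly on the line, and anything lower is the fraction of the spread in box office dollars that the line accounts for. With only a handful of points, a high value is much easier to get by chance than with hundreds.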

I think this is fascinating. Get Glue also gets checkins to TV shows and music listening. It occurs to me that they could, with a fair bit of targeted recruiting and data cleansing, get to a fairly decent audience measurement service. I continue to be amazed by the power of the checkin.

#VC & Technology #Web/Tech

Comments (Archived):

  1. awaldstein

    Fascinating for certain. The TV space may be more interesting than the cinema area. Box office tells us who goes to the movies. But for live streaming TV, how do we really know who is watching what show and whether the TV is just on? Engagement level around TV is a black box that maybe the check in space can crack.

    1. JimHirshfield

      Time shift viewing adds a layer of complexity with TV, but it seems like cable companies know exactly what we all watch.

      1. awaldstein

        Time-shifted, pay per, on demand by definition I think are measurable as you need to initiate the stream. But, at least for now, network viewing appears to be a place where engagement via checkins may equal viewer data. Time shifted catalog viewing is still a frontier for community though. Your comment made me think of this post ( http://t.co/LcAbSlr ) I wrote over a year ago on the “True Blood” experiment that HBO did with SocialBomb. Funky and early but interesting.

        1. JimHirshfield

          Thanks for sharing your post Arnold – good read. Have you kept up with SocialBomb lately? How are they doing?

          1. awaldstein

            This has nudged me to ping them. The downside to what they were doing was that it was DVD based. The cool thing was that HBO broke their content into streams that could be shared as video snippets. I loved that. For movies I want to share video snippets. For example, I don’t want to check in and say “I’m watching Predator”, I want to share the video snippet when he says “If it bleeds it can die!” 😉

      2. awaldstein

        Thnx Jim. That could mean that both checkins and the comScores of the world are not as relevant with a digital stream for viewer metrics. But I don’t think so. We need to go beyond pure traffic numbers to engagement, and the checkin space potentially has a good play here.

    2. William Mougayar

      Very much agreed, Arnold. TV related data would be more telling than cinema.

    3. Alex Iskold

      TV is coming, stay tuned!

  2. JimHirshfield

    Check ins definitely tell a representative story. Same holds true for page views to movie listings pages, or as is the case with Tynt (my employer) the words people copy and paste.

    1. SD

      What population are checkins representative of? The Get Glue users? Moviegoers? Smartphone users? I am no statistician, but wouldn’t it be impossible to know for certain if they are representative unless you knew both the composition of the sample (check in audience), and the composition of the overall population (the entire moviegoing public, or whatever else you are trying to measure)? I think these tools are amazing, but let’s be very careful about how well they predict.

      1. Greg Leman

        Actually, you don’t want to know the composition of the sample — you want a completely random sample.  The problem with this type of analysis is that it’s taking a completely self selected group (Get Glue users) and comparing it to a random sample.

        1. SD

          good point. But while you would want a randomly selected sample, you would like the sample to look basically like the underlying population: e.g., if the population is 50% women, a sample that is 90% men is not representative, even if it is selected randomly.

          1. JimHirshfield

            All good points. I think you solve the random sampling problem by measuring implicit actions (i.e. where users don’t know their data is being aggregated and it’s anonymous), such as search referrer data or copy data. IOW, Google just has to look at how many searches are being done on a particular movie leading up to its release to determine how well it will do.

  3. kumarbshah

    Fred – fellow Wharton alum here. The analysis done by Get Glue is good, and is a step towards pulling valuable/monetizable information out of their data. However, the danger with drawing conclusions from this data is:
    a) It contains only 9 data points (with 4 million check-ins, I am sure they have access to more?)
    b) The regression line was fit when 6 out of 9 data points are skewed towards one side of the graph
    On a separate note, assuming the correlation exists, it would also be interesting to see how they can use this data for forward-looking purposes.

    1. Dan Reich

      I was thinking the same thing. This data is fascinating but as someone with an engineering background, I too would like to see more data points. As for forward looking purposes – could be fun (and dangerous) to use this data to trade futures of box office outcomes.

  4. Emmanuel Bellity

    Interesting stuff. I’m gonna look into their API cause it could work well with our iPhone app HeyCrowd (http://heycrowd.com) to survey people who have checked in and get some valuable feedback + advanced demographic data. Film distributors for example would love that.

  5. Sebastian Wain

    Since this information already existed as box office information, it will be interesting to analyze more correlations. A simple example: how many people eat at Y after seeing film X. One question: since disclosing information to third parties is a difficult issue, why don’t we have more economic incentives for using your data outside a particular site, so that foursquare can share information with getglue?

  6. andyswan

    Very cool. Reminds me of this story on how Twitter language digging could yield impressive predictive results: http://bit.ly/cXvO9U So is it the power of the “checkin” or the power of public announcements of individual choices? Is there really any difference? p.s. Get Glue is an awesome case study in a proper “pivot”. As a small stakeholder in the company I was 90% sure they were “walking dead” at one point not too long ago…..and the only reason for the 10% of optimism was the extreme quality of their leadership. Really brilliant and passionate people. I should have been more optimistic….it’s just another lesson pointing to the idea that investors are betting on the people.

    1. JimHirshfield

      Never underestimate @alexiskold 🙂

    2. fredwilson

      yes, it was a strong pivot

  7. Jefftrudeax

    Looks like the studios should push Something Borrowed a bit harder.  

  8. SD

    I think we need to see a lot more data points, along with a random sample. With this few points there is no way to tell whether the fit is coincidence. Additionally, it would be interesting to see if there is any model that helps explain both types of misses (overprediction, like Something Borrowed, and underprediction, like Pirates of the Caribbean) – is it demographics, weather, some other variable? I would suggest that services like Get Glue would benefit from supplementing these “user results” with a recruited panel that is more representative of the population. This would provide additional insights.

  9. Druce

    Also easy to game. Goodhart’s Law – statistical relationships weaken when relied upon for business decisions, policy or regulation. Think AAA CDO squareds. That being said, if you take a well-chosen and monitored panel, give them a mobile, location-aware app, you could do a lot more than a traditional panel. Not just efficiently process larger panels, but even if they don’t check in, say, we saw you were at the mall, did you notice this store or promotion, and how did it make you feel. Combine it with video systems in the stores and malls to correlate online and real world actions, people who came for a Groupon, where did they go and what did they look at and what did they buy. Maybe they’re already doing that stuff.

    1. andyswan

      Good point. The other way would be to be more like Goldman: Don’t publish results or even let people know you have them….just use them. There’s nothing to game if there’s seemingly nothing to game.

  10. carlos_ag

    This is really interesting. I think the number of data points is not really an issue here, although there are some peculiarities about the graph: the graph is flipped from my intuition: usually the independent variable is represented on the X-axis. I would consider weekend gross to be the independent variable. In other words, I think the fact that a lot of people went to see a movie generated a lot of mobile check-ins, not vice versa. It may be in getglue’s interest to imply the opposite. There is another small caveat: some of these movies had fewer opportunities for check-ins relative to their weekend gross than others. Really the proper metric to correlate here is number of opportunities (viewers of a movie) relative to the number of check-ins. Because Thor, Kung-Fu Panda, Rio and Pirates all had 3D offerings they had relatively few opportunities for check-ins relative to their gross, since 3D tickets cost more than 2D tickets. I’m having a hard time interpreting what it means to be above or below the trend line. Lots of people checked into Something Borrowed versus the gross; does that mean that it failed to convert on the little buzz it was getting? Anyone have any insight into this?

  11. Dave W Baldwin

    Good job GetGlue. The experiment above represents a move toward Real Time (keeping the bar high re definition). The issue is NOT the above graph itself, but what will be offered in the near future. My advice to all readers with the HOT idea/dev is to use the ‘free’ model and gain the biggest number of customers. The product with useful data re customers to match with data like above abolishes nit picking over the sample… you know who/where the sample is. @fredwilson:disqus , I’m going to send a quickie I sent to the West Coast 2-3 wks back… along with @JLM:disqus

    1. SD

      “nitpicking” about the sample goes to what its level of usefulness is for a specific purpose – predicting behavior, and making production investment decisions. Lots of companies have made the claim that if their data can help studios increase their hit rate, it would create billions of $ of value – by facilitating good content investments and avoiding bad ones. If the data is to be used in this way, it needs to be predictive. However, if the purpose is to create useful media for advertisers, then high volume and directionally useful data is all you need. I would argue that the volume of data created by Getglue, along with the underlying audience (entertainment enthusiasts who have told you something about their tastes) is extremely valuable media for any entertainment market….a HUGE advertising category.

      1. Dave W Baldwin

        My use of nit picking was not meant to be too critical, just telling some of the readers to not be so trapped in the present/past. In the case of GetGlue, you do have the ‘current’ parameters to take into account. What is coming to market over the next 3 yrs will be able to do quite a bit more. I won’t go into the ‘bubble’ stuff based on IPO’s vs. the new that will be able to do a whole lot more. You are SO correct regarding HUGE advertising category, which really was my point. Bigger point is there will be all the claims of useful data offering prediction… those that go thru normal time frames of analyzing/deciding/executing will be left in the dust. It will come down to who can deliver, offering track record (transparency), doing the most with instantaneous curation.

  12. Guest

    I admit total ignorance as I have never used the service. When does the Get Glue check-in occur? At the movie theater (presence) or before (intention)? UPDATE: If the check-in occurs at the theater I do not understand the value proposition of building a predictive service. The Get Glue user is presumably already at the theater and has the ticket … the film distributor & studio can use the box office system data to know (predict) the sales. If the check-in occurs prior then I can understand. What is more interesting to me is the data points off the line. For example, what is it about the Get Glue user base that generated loads of check-ins for something like Something Borrowed when the weekend gross was relatively small? I would love to see data just for Get Glue user check-ins for opening weekends. Possibly test first adopter status. Hypothesis: Get Glue users are early adopters of tech; how does this translate to movie attendance? Etc. I agree with a few other posts … more data points would be interesting and better.

    1. SD

      It helps with the advertising sales discussion (a sales person would say – our users track B.O. performance very well, so you should target these users when you are trying to promote new shows or movies).

      1. Guest

        Yes, I can understand that argument/line of thinking somewhat. It might allow a studio to gain confidence in the notion that ad placement targeted at Get Glue users could be productive. But I do not understand the predictive comment without more information about when the check-in occurs. What the data seems to be showing is that Get Glue users might be ‘average’ in the sense that they track with overall movie attendance/box office gross statistics. Again, more data points. If Get Glue is trying to develop monetization strategies I think there are some other things that could be measured and analyzed rather than what is proposed above. [UPDATE at 12:32PM Central U.S.: I did not receive my @disqus email telling me SD replied to my comment until 4 hours later – anyone else having delayed notifications?]

    2. JimHirshfield

      Short answer is GetGlue checkins data is quicker. Lord knows how delayed and archaic the systems are that collect box office sales. Also, keep in mind that GetGlue could expose or license this data to anyone interested – whereas box office receipts data is “controlled” by industry.

      1. Guest

        A slight delay for the industry is not that big a deal. If I have to wait a few more hours fine … it is not as though I can buy more time slots to run t.v. ads to alter. I recognize faster data might mean being able to juice other more real time marketing efforts though. Your last point is quite solid. Have to think on that some more. Who else really would gain substantially from an economic standpoint by having this data faster? You have intrigued me (especially since movies are on my Sh!t List at the moment because my little sister recently found out her scenes were cut from a new picture that was released).

      2. Pete Griffiths

        They may be archaic but they are certainly not delayed.  Studio execs can predict the film’s lifetime box office by the end of the opening weekend.

    3. ShanaC

      engagement over quality film?

      1. Guest

        I assume your reply @ShanaC:disqus is in regards to my comment about the data points off the line and my example of the ‘Something Borrowed’ data point. If not, please clarify. Yeah, your comment might be the case. It could also be the result of what I was alluding to earlier and others are commenting about…the fact that Get Glue users might not represent a true random sample. I don’t know what is happening with the ‘Something Borrowed’ data point but you bring up one good possibility.

        1. ShanaC

          No, you understand me correctly.

  13. Ciaran

    It might be interesting for TV, but for movies it’s kind of after the fact. They (movie companies) need to predict popularity, not measure it. And they’re doing this with YouTube & Facebook data.

    1. ErikSchwartz

      The movie studios already have this really accurate measuring device called revenue. The issue I’ve had working with the GetGlue TV check APIs is they only let you check into a series, not a specific show. I can check into Law and Order, not L&O season 7 episode 12, so you can’t really build a proper loyalty program on top of it.

      1. Guest

        I have already admitted my ignorance on Get Glue specifics. Thanks for sharing some specifics that make me smarter on the topic, Erik. Very good point you make too, BTW.

      2. ShanaC

        What if you don’t know that information as the checkinee?

        1. ErikSchwartz

          Ideally you would need some kind of automated system for recognizing content on TV. Not to sound too self serving, but something like what SynchronizeTV is building. Of course if you want to use stuff like this for any kind of media monitoring where money is involved, automated content ID is crucial because otherwise the system is way too easy to game.

          1. Aaron Klein

            This is why SynchronizeTV is going to be so cool.

  14. markslater

    Can someone explain how this makes money? predicting product performance? If i am the movie studio, do i not look at my rake per location as the true indicator? if i look at check ins and make a trending assumption – why not just go to revenue trending and come to the same conclusion. I’m confused.

    1. Guest

      Yes, that was my point. However, I am not sure when the check-in occurs. So, depending on when that action (e.g. the check-in) occurs the conclusion might change a bit.

    2. Tereza

      Mark, you are correct (as usual!). The prediction that counts re: Box Office is after the first weekend’s run, when they renegotiate the length and price of the run. They also use this to determine go-forward marketing spend (they either abandon the film or push it hard). No reason not to use actuals. The Holy Grail in film is to be able to predict BO performance BEFORE it opens, so they can budget accurately for marketing and distribution. Check-ins can’t help that, it’s management thru the rear-view mirror. BO prediction is more about correlating similar content, calendar, etc., and also results from focus groups. Also I’m not sure that the check-in sample size at a specific location is large enough to be meaningfully predictive either (for the local movie theater’s/distributor’s benefit, not the studio’s). When I do check into something like a movie it’s a very small number of people checking in. Small enough that the results would be kooky and could swing wildly based on who the audience is. Most of the time when I check in to a normal life place I’m still by myself or one of very few. Incidentally I’d be very keen to know the check-in/BO relationship with the opening weekend of Bridesmaids because I think there was an unusually strong representation of females who are more tech-savvy than the norm and had been pulled in via online buzz.

  15. Ciaran

    As I said, there’s money in predicting performance, as it helps decide what level of marketing budget to put behind it, how many screens to show it on, etc… But I’m not sure this does predict performance. On the TV side, agencies and TV stations pay, a lot, to find out how many people have watched a show, as it helps determine future ad placement.

  16. Sllecks

    It seems apparent that the data from checkins will be huge. If collecting data is the goal, I do think that business-side check in is an interesting angle that doesn’t seem to be pursued. Think of a restaurant that checks in a patron when they are put on a waitlist. Collect, mine and extrapolate and you can get some super interesting insight into consumer behavior. Obviously there are some caveats that go with this, but it’s nothing someone clever can’t figure out.

    1. ShanaC

      If it weren’t for the overhead costs of a restaurant, it may be interesting to create apps for restaurants to allow them to do orders….

  17. Tereza

    Really interesting. I spent a bit of time in past lives using and blending similar data sets (e.g. @ Disney predicting BO performance, IRI w consumer scanner data for CPG supply chain forecasting and then of course selling TV ads). I was the person they’d bring in to take this data and translate it into action in the field, at scale. I’m also a bit inconsistent as a check-in person. Many of the people who are my demographic are similarly inconsistent with check-ins. So if I put on my “how do I put this data to action” hat, I *think* about — and would love to know — the gender breakdown of checkins, simply because I suspect women check in a lot less than guys (for safety reasons). That said, maybe Get Glue is a different beast. I’d need to understand this. That .95 R-square is awesome; it looks a bit like the male-skewed content fell above the line while Kung-Fu Panda (more women with their kids) was under-checking in. As they develop this, gender as an analytic cut would be very worthwhile, as in my experience evaluating this stuff and developing actions off it, there are often fundamental behavioral differences which, if blended, could lead you to the wrong conclusions. But all in all, very cool!

    1. ShanaC

      I think a lot of people are inconsistent check in people. I know it may sound not nice, but I don’t want to know every detail of your life sometimes… OTOH, if I had checked in at every Duane Reade I went to Tuesday, I’m sure I would have heard back from customer service (they weren’t stocking sunscreen)

      1. Tereza

        If groups of people are inconsistent at the same rate across the sample set, then it’s not a problem that affects the data. It becomes an issue if some meaningful clusters/cuts of subjects are inconsistently inconsistent.

        1. ShanaC

          That may be the case with checkins because of how people want to appear/what they get out of it

  18. Renee

    This is interesting, but one thing I’ve wondered about with using check-in data to do this sort of thing is selection bias.  Doing statistical analysis of data gleaned from early adopters of a startup is probably going to result in data skewed by the preferences of the early-adopter demographic. 

    1. Guest

      Aye, that is a possibility. Echoing again the earlier calls by others … more data points would be prudent.

  19. RichardF

    I’m going to be honest, I don’t really get check in behaviour. I don’t understand what is going to drive mainstream users to check in on a regular basis. I gave up checking into foursquare ages ago because I didn’t see the point, no incentive whatsoever. To me it’s a bit like when a company emails you or you see a pop up on a website asking for “just ten minutes of your time to fill in an online survey about their product or service”. Yeah right, because I really don’t have anything better to do. I can see Facebook winning this battle because telling people what entertainment you like fits in well with the stuff that gets posted onto the average facebook profile page.

    1. Brad

      Richard, I tend to agree with you. On top of that, I do not understand the capitalization of the projects either. How will they utilize this service to create value? Eventually companies have to have more revenues than expenses. Whenever I see friends checking in on Facebook to the grocery store or otherwise I wonder why they are choosing to share this with me and what do I get from the information? The next question I would have is what percent of the people that went to those movies participated in checking in? If it is a small percent, is it really good information and can it be accurate?

      1. RichardF

        Like you Brad I have no idea why someone would want to broadcast they are at the grocery store. I wish Twitter would allow me to filter out foursquare “I’m at some coffee shop in middle earth” alerts

    2. Aaron Klein

      Same experience with Foursquare – the only initial value was seeing what places your friends were going and spontaneously finding them near you. Didn’t happen often enough. I’m not sure if you’ve noticed, but an awful lot of places are now starting to deliver value with the Foursquare check-in. Whether it’s enough to really scale the thing to mainstream behavior remains to be seen, but I wouldn’t bet against @dens getting this right.

      1. RichardF

        I haven’t noticed, Aaron, simply because I don’t use foursquare at the moment. The other thing is that I don’t live in a large metro area in the US. I can believe that in New York or SF the foursquare experience is miles better. I do believe that sometimes you can be too clever for the mainstream. One of the reasons that Groupon has been so successful is that its offering is easy to understand, open to anyone with email and offers a direct tangible benefit. Once they’ve filled their boots on IPO I think Groupon will turn their attention to loyalty and because they have a relationship with lots of retailers I think they will be successful. I’ve seen, first hand, how large high street retailers will trial new technology because they are always looking for new ways to engage with their customers. McDonalds in the UK used ‘on pack’ sms campaigns massively a few years ago, they’ve dropped it now. Starbucks are similar. I’ll bet they add their own check in element to the Starbucks app that incorporates a loyalty programme in the not too distant future.

        1. Aaron Klein

          I don’t use it at the moment and I don’t live in an urban city either, although I’m going Android later today so I may give it another whirl. I agree entirely that Foursquare may be too cool for school to break into the mainstream, but it remains to be seen. I have to say that I love the idea of centralizing my loyalty + payment into a single app rather than a separate UI for each brand I interact with. Whether it’s @Square:twitter Card Case, Google Wallet, @Foursquare:twitter or something we haven’t thought of yet, I’d love to get rid of the weight in my wallet, which currently consists of one debit card, three credit cards and my Delta Medallion, Hilton HHonors and Starbucks cards. Imagine not having to carry around more than one physical card (for dead battery emergencies of course!)

    3. fredwilson

      these systems will ultimately be tightly integrated with rewards/offers/loyalty programs

      1. JamesHRH

        Fred – I agree totally with the loyalty aspect of checkins. When I heard about the Foursquare / Safeway trial, this light went on in a major way. It is interesting that you lead the post with comScore. I think check ins are complementary to ratings and not competitive. Do you? Does comScore ;-) All media properties have hardcore users (people who should maybe broaden their interests a bit!) and a media check in service that could be verified (with comments about the content being provided etc.) could be of real value to media operators / owners.

      2. Douglas Crets

        So, is that the end of those little keyring bobdoohickeys they used to use for loyalty tracking and inventory control that we used to get at supermarkets? 

        1. fredwilson

          i hope so

      3. RichardF

        completely off topic is that UPS sponsored box on your site producing relevant text based on the Disqus comments?

        1. fredwilson

          not yet

  20. Nick Gavronsky

    This is why check ins have become so powerful: it’s a way for anyone to directly connect and tie themselves to something in real time. Also, if it’s a brand, location, or event that they are loyal to, they are more likely to keep coming back. LBS have so much useful data in their hands now and even more so as check ins become more popular.

  21. Douglas Crets

    More than that, they could use check-ins to sell records and cassette tapes. I’m being serious. GetGlue is the perfect service for location-based social selling of media. 

  22. Mark Shannon ONeill

    Is it not possible that the release of a blockbuster film could be the leading indicator of how many GetGlue checkins you’ll get? One should not confuse correlation and causation. Nor should one confuse interesting with actionable. This strikes me as being similar to the stories (urban legends?) of water pressure drops in NYC due to toilet flushes at the end of the Super Bowl ( http://bit.ly/lHySLV ) or the infamous 1983 finale of M*A*S*H* ( http://bit.ly/iqa0n6 ). Get Glue users may be active movie-goers disproportionate to the general population. They are by definition “self selecting” and certainly not statistically representative of the population as a properly designed and executed random sample or panel would be – therefore not projectable with statistical reliability. “…can a checkin app be a representative sample for the purposes of measuring and predicting entertainment product performance?” Technically, no. That said, I bet I could sell that data to the marketing and promotions departments of any number of major studios.

    1. Alex Iskold

      It is certainly possible. The key thing here is the trend shows straight relationship between checkins and $. In the expanded blog post on our site we showed also another chart – checkins before the opening weekend vs. $ in the box office. The chart is similar in nature. This all suggests that social gestures like checkins are indicators of buzz and could potentially be used to predict the results in the box office.

      1. Bala

        Don’t be fooled by “Straight” relationships… they lie! there is nothing in this world that has a linear relationship.

  23. toddysm

    Hi Fred, This is a very interesting correlation. I am wondering why #7 is so off the chart. How does Get Glue interpret the numbers for this outlier?

    1. NICCAI

      Rom com?  Male to female ratio?  Wonder what the gender ratio is for check-ins?  Also, given that check-ins are likely tied to more tech savvy users, I wonder how ratings on sites like Rotten Tomatoes and Metacritic correlate?

  24. NICCAI

    Perhaps the correlation is between check-ins and buzz and not check-ins and box office?  It seems to me that social buzz is the true correlation, but we would need to cross-reference FB and TW to truly understand the effect of social sharing.  That said, the power of sharing  and word-of-mouth recommendations probably can’t be disputed as a major driver of returns.  Get glue’s check-in correlation also maps (what would seem) a nice slice of the representative market.

  25. Luke Toland

    This is an interesting analysis. As with any Ordinary Least Squares (OLS) regression, you have to take the findings with a grain of salt. There are a variety of conditions necessary to generate an accurate OLS. A very high R-squared is generally an indicator that the regression is poorly designed. Checkins and box office earnings are highly correlated. The more people that go to a movie, the more likely checkins are to occur. Plus, even if the relationship were valid, as the Get Glue user base expands, revenue per thousand checkins will likely decay due to dilutive effects. What really needs to happen is to incorporate other metrics, like Tweets, Rotten Tomato score, Facebook Likes, number of reviews and advertising spend as well as Get Glue checkins and then run tests to determine which variable actually holds true. Otherwise, running the Get Glue regression in isolation is almost meaningless.

  26. Pankaj Prasad

    When looked at as an explicit interest indicator, the check-in gesture has a whole slew of measurement and predictive analytics possibilities that map users to the real world (via their smartphones).  The analysis of these really does become Google Analytics for the real world.  We’re starting to see the power of check-ins on the consumer side (like GetGlue, 4Sq). Within enterprises, when paired with structured, work-relevant objects (beyond geo), this gesture provides significant personal and organization-wide insight as to who is working on what, where and why.

  27. Kevin Drost

    I’ve been doing a lot of work lately around improving the predictive capabilities of new content releases. This correlation is interesting; however, as others have noted, I would be careful not to confuse correlation with causation. One interesting aspect is that you can look at activity in near real-time as opposed to the completely backwards-looking nature of panels and focus groups. That said, check-ins really don’t give much predictive data in this context. There are a lot of interesting companies in this space. Marketshare Partners (http://marketshare.com/) is one I think is really interesting (I know Elevation Partners is an investor).

  28. jystervinou

    Do you think that a checkin is a poor man’s tweet? It takes even less effort than a tweet (zero characters! 😀 ) but I think a tweet reflects the same behavior (I’m watching this). I run a French social TV site, and I compute live ratings based on tweets. ( http://devantlatele.com/aud… click on dots to see the corresponding program names) In fact, there is not much correlation (at least for now) between social TV ratings and old-school TV ratings. The former measure engagement with a program, the latter raw viewing data (whether or not people are really watching rather than multitasking 🙂 ). Based on these observations over recent months, I think these social ratings are complementary, not a replacement.

    1. fredwilson

      yes, i think that is exactly what a checkin is

      1. Alex Iskold

        In addition to this, it is easier to grok if it’s focused on a vertical like TV. Structured data is always more understandable by the mainstream than heterogeneous data.

      2. Pankaj Prasad

        We call a Checkin a structured status message.  Tap to update rather than type.

  29. matthughes

    Very cool. I wonder if it’s viable for Get Glue (or others) to create an in-TV app (à la Netflix, FB or Skype…)?

  30. Anthony Durante

    I’m no statistical genius, but I saw a flaw in this from the start.  Some of the folks who left comments provided some very rational statistical reasons why this is flawed, but to me, it was much more elementary.  The first question that popped into my mind was “What percentage of viewers checked in?”  So, for fun, I Googled the average ticket price for a movie, which is surprisingly low at $7.89 in 2010 (I paid $10 to see Thor and Pirates 4 recently).  Divide the box office gross by the average ticket price and you get the number of viewers; then divide the number of check-ins by the number of viewers and you get the percentage of viewers who checked in.  Sort the list by that value and everything changes! First scary fact:  All but one of the movies fail to break the 0.2% check-in bar. Second scary fact:  The movie with the highest percentage of check-ins didn’t even break 1%. For anyone waiting breathlessly for the results, Something Borrowed has the highest percentage of check-ins at 0.9226%; it ranked 4th in number of check-ins and 6th in box office gross. Last I heard, random surveys get 2-3% feedback.  I’m not so impressed with check-ins.  Sorry!
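
    The arithmetic described in this comment can be sketched roughly as follows. This is only an illustration: the $7.89 average ticket price is the figure quoted above, while the film names and the check-in and gross numbers are made-up placeholders, not Get Glue's data.

```python
# Average 2010 ticket price quoted in the comment above.
AVG_TICKET_PRICE_2010 = 7.89


def estimated_viewers(gross_usd, ticket_price=AVG_TICKET_PRICE_2010):
    """Approximate tickets sold as gross divided by average ticket price."""
    return gross_usd / ticket_price


def checkin_rate_pct(checkins, gross_usd, ticket_price=AVG_TICKET_PRICE_2010):
    """Share of estimated viewers who checked in, as a percentage."""
    return 100.0 * checkins / estimated_viewers(gross_usd, ticket_price)


# Hypothetical (checkins, gross in USD) pairs, for illustration only.
films = {
    "Film A": (12_000, 78_900_000),
    "Film B": (4_000, 7_890_000),
}

# Rank by check-in rate rather than by raw check-ins or gross.
ranked = sorted(films, key=lambda name: checkin_rate_pct(*films[name]), reverse=True)

for name in ranked:
    print(f"{name}: {checkin_rate_pct(*films[name]):.4f}% of estimated viewers checked in")
```

    In this made-up example the film with fewer check-ins and a smaller gross ranks first once the list is sorted by rate, which is the kind of reordering the comment describes.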

  31. William Mougayar

    Maybe I’m not getting the insight here, but all this says is that the greater the number of people that go to see a given movie, the greater the number of them that will check in when they see it. It’s a linear and proportional thing that does not need statistical formulas. They are already checking in by buying a ticket. What would be more telling of future box office success is sentiment analysis on Tweets and Retweets AFTER they see the movie.

    1. leigh

      Get Glue should make a deal with the Scotia Bank Scene card here in Canada to somehow tie it to their movie points.  Executionally a bit of a nightmare i’m sure but to me it would be more interesting to see if a connected online check-in service drove incremental usage vs. control group of those without it.  

  32. William Mougayar

    I’m not disputing the power of the checkin because it is telling, especially in aggregate analysis. But it is more powerful when there is no other way to checkin. In this case, buying a ticket “is” the checkin.

    1. Alex Iskold

      For sure. What if you want to know how many people will buy the ticket? What if you want to influence how many people will buy the ticket?

      1. Bala

        Does intent translate to the actual act of checking in? I don’t check in when I want to go to a movie. Maybe if more people “LIKE” a preview or do a +1 on a movie then that could be a predictor. However, I don’t think the internet of things follows a simple linear correlation; it follows a power law distribution, and it is very hard to make any of the statistics work for you when the distribution of movies follows a very strong power law, i.e. a few movies take the majority of the audience, and if your representative sample is in the majority then you have no correlation.

  33. vruz

    Get Glue could be the next Nielsen. And they should. Sometimes you start with a draft of an idea and the market takes you somewhere else, and if you listen carefully to the market, that somewhere is sometimes a great place to be.

  34. Cam MacRae

    Whilst Get Glue could become a fairly decent audience measurement service, statistically speaking this data isn’t all that interesting. On the one hand you’ve got a low quality, non-random sample of check-in data vs. a census of tickets sold. So whilst you may have discovered a feasible linear relationship it’s not all that useful post hoc. What would be useful as a predictor is a measure of intent: Tell me how many screens I need to run to maximise profit.

  35. Mark Essel

    It’s fascinating to see how fast data can identify anomalies (both good and bad). The gradients and trends of attention are a swirling tempest of value.

  36. Dogs

    This all suggests that social gestures like checkins are indicators of buzz and could potentially be used to predict the results in the box office.

  37. Bala

    I think this correlation makes no sense to me. Sorry… being a statistics buff, this is just poor analysis. The data does not represent what you really want to measure, but it looks like “the message” was fitted into the data, which typically happens if you want to sell a story! Sorry Fred, I don’t buy the rating story, but I do believe promotions and getting people engaged in your content release through checkins work and resonate with people. Of course, when you give something away for free in exchange for a checkin, do you really have a “representative” sample? Just sayin’….

  38. Eric FD

    Fred – R2 is a flawed metric. In practical statistics, it doesn’t have nearly the impact one thinks it does. It is also EXTREMELY unstable and dependent on a few factors, such as the range of the X and Y axes. What this means is that if you had done this analysis on another, slower week where all the box office grosses (or check-in usage rates) were homogeneously low (or homogeneously high), then the SAME TIGHTNESS OF THE ASSOCIATION would yield a lower correlation, even though predictively the residual around the line is the same. Basically, I’m saying: take the results with a grain of salt. They cherry-picked a good weekend to do this analysis, but this won’t be stable from week to week or month to month, as rates of movie watching and rates of check-ins vary (you only get strong correlations in weeks with a large range of receipts/check-ins). Cheers, ~Eric D, an epidemiologist and statistician from Cambridge, MA
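
    The range effect described in this comment can be illustrated with a small sketch. The numbers below are invented for the purpose: both data sets lie around the same line (y = 2x) with exactly the same scatter of ±3, and only the spread of the x values differs.

```python
def r_squared(xs, ys):
    """R^2 of a simple linear regression (the squared Pearson correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Identical scatter around the same underlying line in both cases.
noise = [3, -3, 3, -3, 3, -3]

# Wide x range, e.g. a week mixing blockbusters and small releases.
x_wide = [0, 10, 20, 30, 40, 50]
# Narrow x range, e.g. a week where every film performs similarly.
x_narrow = [20, 21, 22, 23, 24, 25]

y_wide = [2 * x + e for x, e in zip(x_wide, noise)]
y_narrow = [2 * x + e for x, e in zip(x_narrow, noise)]

r2_wide = r_squared(x_wide, y_wide)
r2_narrow = r_squared(x_narrow, y_narrow)

print(f"wide range:   R^2 = {r2_wide:.3f}")
print(f"narrow range: R^2 = {r2_narrow:.3f}")
```

    With these invented values the wide-range set gives an R2 above 0.99 while the narrow-range set falls below 0.5, even though the scatter around the line is the same in both.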

  39. Eric Fader

    You should speak with Rob from momentfeed.com where they have a platform that easily aggregates and analyzes checkin data.

  40. Chris DeVore

    Hi Fred – don’t know how I missed this post in my feed, but the idea that observed affinity is the logical (and significantly more accurate + timely) replacement for panel-based measurement is dead-on. If you don’t know of them, take a look at Colligent (http://www.colligent.com) – an early-stage firm that’s doing this at impressive scale (e.g., 250MM profiles and 32K “brand objects” tracked)

  41. cook animial

    Very possible and very cool. Thanks, Sam Goldberg, COO, PPM.net

  42. Juan Sagasti

    Here is an idea:  a mobile phone app that opens the mic for 1 second every 10 minutes and submits the sample to a central server. The (timestamped) audio is compared with the broadcast TV channels, and if there is a match you know what the phone owner is viewing. Bribe the users with PayPal money or something virtual. If it works, buy me a pony.

  43. offeredlocal

    So, how about this: Nielsen tells you I watched something on my TV, etc. Check-ins tell you I was there and that I was actually compelled enough (read: not embarrassed) to tell other people.  Now, which is the more powerful statement, hmmmmm?  I suspect you’ll need to reevaluate the whole number-crunching process because it isn’t really possible to compare old-school monitoring with check-ins.