How can you determine the health of a community?
A lot of community managers just go with their gut on this one, or use proxy metrics like signups, posts per day, Klout scores, retweets or some other fairly hollow metric. There are better ways.
This is very much a work in progress, so I’d love to collaborate. If anyone has any thoughts, please jump into the comments section and let’s discuss. That being said, most of this isn’t new; it’s just stolen, adapted and generally simplified from concepts like Network Theory, Affinity Groups, Clustering Coefficients, Small World Networks, and other things I will never fully understand or convince people to invest tech into.
Let’s dig in…
What is Network Density?
First off, Network Density (ND for short) isn’t one number. It’s more like blood pressure, where they say “120 over 80”. I have no idea what the 120 or the 80 mean, but it works as an analogy. So, with that in mind, ND breaks down roughly as:
Average Distance Between Users : Number of Paths : Frequency of Interactions
or simply put…
AD : NP : F
Let’s break each part down…
Average Distance Between Users (AD):
This is known as a small world network problem. It’s also been called the “average shortest distance” (network theory usually says “average shortest path length”).
You want this number to be low.
Simply put, it’s the number of “hops” it takes to link one person to another. Think of it as the Six Degrees of Kevin Bacon – anyone is probably connected to anyone by some number of steps.
Generally the fewer the steps between 2 arbitrary people, the closer your community is and the easier it is for people to jump into new conversations.
For example, Bob starts a convo, Mary knows Bob and replies, Sam knows Mary so can jump in, and John knows Sam so he jumps in as well. John never knew Bob, but is now engaging with him because of your community connections.
The fewer people it takes to connect John and Bob, the better off you generally are and the lower your AD will be.
In social networks this is easily defined as “friending” or “following”. In forums or gaming it’s a little more complicated, but you can use private messages, public replies, shared threads, guild or clan membership etc. Define it and play with it as is most useful to your community.
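If you want to see AD as an actual number, here is a minimal sketch in Python. It assumes you have already decided what counts as a “connection” and stored it as a simple who-is-connected-to-whom dictionary; the `community` data and the `average_distance` function name are made up for illustration, using the Bob, Mary, Sam and John example.

```python
from collections import deque

def average_distance(graph):
    """Mean number of hops between every pair of users who can reach each other."""
    total_hops, pairs = 0, 0
    for source in graph:
        # Breadth-first search: hop counts from `source` to everyone reachable.
        hops = {source: 0}
        queue = deque([source])
        while queue:
            user = queue.popleft()
            for neighbor in graph[user]:
                if neighbor not in hops:
                    hops[neighbor] = hops[user] + 1
                    queue.append(neighbor)
        total_hops += sum(hops.values())
        pairs += len(hops) - 1  # everyone reached except the source itself
    return total_hops / pairs if pairs else 0.0

# Hypothetical connections: Bob knows Mary, Mary knows Sam, Sam knows John.
community = {
    "Bob": {"Mary"},
    "Mary": {"Bob", "Sam"},
    "Sam": {"Mary", "John"},
    "John": {"Sam"},
}

ad = average_distance(community)
print(round(ad, 2))  # 1.67
```

This only averages over pairs that can actually reach each other, which is one possible choice; isolated members are ignored rather than counted as infinitely far away.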
But just because you have a low AD doesn’t mean your community is healthy. You have to think about the health of the connections, so we add in the Number of Paths metric.
Number of Paths (NP):
You want this number to be high.
Having lots of backup connections is good. If you removed a key person from your community, what would happen? The more ways you can get from Person A to Person B, the less impact that loss would have.
In the earlier example, John and Bob connected with each other through the “path” of Mary and Sam. If you took Mary out of the community, you’d need a new path to connect John and Bob. The more of these backup paths you have between your community members, the better.
But even with a low AD and a high NP, you could still have a low-engagement community, which may or may not be “good”. So we take Frequency of Interactions into account.
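One way to make “backup paths” concrete is to count the distinct routes between two people, then remove someone and count again. The sketch below is only one possible reading of NP (routes that never visit the same person twice); Alice is a hypothetical extra member added so that a backup route exists, and the function names are mine. Counting every route like this gets slow on large communities, so treat it as a toy.

```python
def count_paths(graph, start, end, visited=None):
    """Count routes from start to end that never visit the same user twice."""
    visited = (visited or set()) | {start}
    if start == end:
        return 1
    return sum(
        count_paths(graph, neighbor, end, visited)
        for neighbor in graph[start]
        if neighbor not in visited
    )

def without(graph, removed):
    """Copy of the graph with one user (and all their connections) taken out."""
    return {
        user: neighbors - {removed}
        for user, neighbors in graph.items()
        if user != removed
    }

# Hypothetical connections; Alice gives Bob and John a second route.
community = {
    "Bob": {"Mary", "Alice"},
    "Mary": {"Bob", "Sam"},
    "Sam": {"Mary", "John"},
    "John": {"Sam", "Alice"},
    "Alice": {"Bob", "John"},
}

np_before = count_paths(community, "Bob", "John")
np_after = count_paths(without(community, "Mary"), "Bob", "John")
print(np_before, np_after)  # 2 1
```

Losing Mary drops Bob and John from two routes to one, which is exactly the fragility NP is meant to flag.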
Frequency of Interactions (F):
You want this number to be high.
My extended family is rather large, has a low Average Distance between any two people and a high Number of Paths between us (we all know each other and can reach each other in multiple ways)…but we don’t really talk all that often.
Is that a healthy relationship? It’s strong, but over time, without many interactions, it really isn’t going to maintain its strength and resilience.
Think about your high school group of friends – low AD, high NP, but as soon as you all left school the Frequency of your communication might have dropped and the group became weaker.
The important part here is that it’s not absolute, it’s very much relative to your situation.
A community of 100 people with 10 total interactions may be “better” than a community of 1000 people with 100 interactions. You need to pick the right metric and scale for you and your goals.
Also, pick a time frame that works for you (interactions per day, week, month etc) and define what makes “an interaction” the way that is most meaningful to your goals.
And just to throw a wrench into the works, if you can, try to take quality into account (long, positive, substantive comments).
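As a rough sketch of one way to put a number on F: pick a time frame (weeks here), count the interactions in each, and scale by community size so that small and large communities can be compared. The log format, the sample data and the per-member scaling are all assumptions on my part, not a standard definition, and quality is not taken into account at all.

```python
from collections import Counter
from datetime import date

# Hypothetical interaction log: (day, who started it, who they interacted with).
interactions = [
    (date(2024, 1, 1), "Bob", "Mary"),
    (date(2024, 1, 2), "Mary", "Sam"),
    (date(2024, 1, 3), "Sam", "John"),
    (date(2024, 1, 9), "Bob", "Mary"),
]

def weekly_frequency(log, member_count):
    """Interactions per member for each (ISO year, ISO week) found in the log."""
    per_week = Counter(tuple(day.isocalendar()[:2]) for day, _, _ in log)
    return {week: count / member_count for week, count in sorted(per_week.items())}

f_by_week = weekly_frequency(interactions, member_count=4)
print(f_by_week)  # {(2024, 1): 0.75, (2024, 2): 0.25}
```

Swapping the week for a day or a month, or filtering the log down to whatever you decide “an interaction” means, only changes the inputs, not the calculation.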
So, We Have the Formula… Now What?
This formula gives you a baseline. You can then track how your community is doing based on how that baseline changes.
Watch the change, not the number.
As with all metrics, the numbers themselves don’t really matter. It’s the direction of those numbers over time: Is AD generally going down while NP and F are generally going up? Are the numbers generally reflecting or indicating the changes you’re expecting or working towards in your community?
There’s not really a “right” combination or single number that you should aim for. And it’s confusing as hell when you first measure: “What on earth am I looking at? Is this good????”
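To make “watch the change, not the number” concrete, here is a tiny sketch that compares the first and last of a series of snapshots and reports which way each part moved. The monthly numbers are invented purely for illustration, and comparing only the first and last snapshot is the simplest possible choice; a real report would probably want something smoother.

```python
# Hypothetical monthly snapshots of AD : NP : F (made-up values).
snapshots = [
    {"month": "Jan", "AD": 4.8, "NP": 3, "F": 120},
    {"month": "Feb", "AD": 4.5, "NP": 4, "F": 150},
    {"month": "Mar", "AD": 4.1, "NP": 6, "F": 140},
]

# The direction we hope each part moves in: AD down, NP and F up.
WANTED = {"AD": -1, "NP": +1, "F": +1}

def direction_report(history):
    """Say whether each part moved the wanted way between first and last snapshot."""
    first, last = history[0], history[-1]
    report = {}
    for part, wanted in WANTED.items():
        change = last[part] - first[part]
        report[part] = "improving" if change * wanted > 0 else "flat or worsening"
    return report

report = direction_report(snapshots)
print(report)  # {'AD': 'improving', 'NP': 'improving', 'F': 'improving'}
```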
F is fairly easy to understand, but AD and NP really confuse a lot of people. Is 4 : 10 good, or should it be 1 : 10000? To be honest, I’ll be damned if I know 🙂
Globally we’ve all heard of the “six degrees of separation” where AD=6. Amongst movie actors the AD is 4.54, and for Facebook users it’s between 3.74 and 4.72 depending on the source. So, as a benchmark, below 10 and above 2 is probably “ok”, but I’m just guessing; the “real world” seems to be around 4 or 5.
For NP it’s harder, partly because you can manipulate it. If you have an AD of 3 and only count paths up to that length, your NP might be 4. If you increase the AD you report against to 4 (remember, a higher AD is worse), there are LOTS more paths – it should follow a power-law distribution (I think).
To start, you can try tracking 2 “hubs” and see how the NP between them changes over time. You can take the AD at a set point in time as your benchmark (with money we often hear “$10,000 in 1990 dollars”; same thing) or set an arbitrary AD number: “how many ways are people connected in fewer than 27 hops?”
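Here is a small sketch of why the hop limit matters so much when you report NP. It counts the routes between two “hubs” that use at most a given number of hops, for a hypothetical group of six members who are all connected to each other. The group, the names and the function are assumptions for illustration, and the output shows only that the count climbs quickly; it doesn’t confirm or refute the power-law guess.

```python
def paths_within(graph, start, end, max_hops):
    """Count routes from start to end that use at most max_hops hops
    and never visit the same user twice."""
    def walk(user, seen, hops_left):
        if user == end:
            return 1
        if hops_left == 0:
            return 0
        return sum(
            walk(neighbor, seen | {neighbor}, hops_left - 1)
            for neighbor in graph[user]
            if neighbor not in seen
        )
    return walk(start, {start}, max_hops)

# Hypothetical group of six members who are all connected to each other.
members = ["A", "B", "C", "D", "E", "F"]
group = {m: {other for other in members if other != m} for m in members}

# NP between hubs A and B as the allowed number of hops grows.
np_by_cutoff = {hops: paths_within(group, "A", "B", hops) for hops in range(1, 5)}
print(np_by_cutoff)  # {1: 1, 2: 5, 3: 17, 4: 41}
```

Whatever limit you pick, keep it fixed from one measurement to the next, otherwise the change in NP says more about your reporting than about your community.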
Again, you have to apply it to your situation and use whatever makes most sense for your goals and needs.
Network Density will be Different Based on Community Types
Different platforms and different goals also lend themselves to very different Network Density numbers.
For example, a broadcast account on Twitter that has 3,000,000 followers might have an ND of 2 : 1 : 300 because every user is connected to every other user through the broadcast account (AD=2), ONLY through the broadcast account (NP=1), and people tend to reply to or RT it, giving it a high(?) F.
Number of Paths=1 isn’t a problem because, for a broadcast account, when they are done with broadcasting it doesn’t matter if the followers continue interacting. Here, the most important metric is going to be F.
If I saw an ND of 2 : 1 : 300 in a community, the NP=1 would REALLY worry me and I’d say you really don’t have a community. But for broadcast or social, this is probably what you will see.
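For the curious, the first number in a broadcast setup can be worked out directly. Assuming followers are connected only to the broadcast account and never to each other (a simplification), each follower is 1 hop from the account and 2 hops from every other follower. The function name is mine.

```python
def star_average_distance(followers):
    """Average hops between all pairs when followers connect only to one hub."""
    hub_pairs = followers                               # hub <-> follower: 1 hop
    follower_pairs = followers * (followers - 1) // 2   # follower <-> follower: 2 hops
    return (hub_pairs + 2 * follower_pairs) / (hub_pairs + follower_pairs)

print(star_average_distance(3))                    # 1.5
print(round(star_average_distance(3_000_000), 4))  # 2.0
```

With only a handful of followers the average sits noticeably below 2, but at broadcast scale nearly every pair is follower-to-follower, so it is 2 for all practical purposes.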
A gaming community, however, might have an ND of 4 : 300 : 2 because they’ve chosen to implement a guild system and count everyone in the guild as being connected. Not everyone in the guild talks to everyone else every day so F is low, but anyone in that guild could reach anyone else at almost any time (within 4 hops) if they needed…and anyone could be booted out from the guild without having a major effect on the health of the guild or the community.
As your Community Grows, Network Density will Change
The life stage of your community is generally also going to affect your ND.
Early stage communities almost always have a very low AD, an NP relatively high compared to their userbase, and a really low F. There just aren’t that many people (AD = low) and they all probably know each other (NP = high), but this probably isn’t their primary communication path just yet (F = low).
Growth communities are generally going to see a rapid increase in AD and drop in NP as a lot of strangers enter. This is natural and probably “ok” as long as the rate at which it is happening isn’t too high and the old-timers can connect to the new-timers fast enough to pass on the culture you spent so long building. The cool part is you can see this dilution happening and can take steps to address it, something you might not see or act on based on registered users or comments per user alone.
Mature communities are generally always going to have a high NP and F, because there’s just so much “stuff” going on. Here you may want to adjust the scale to your userbase size at some point – an F of 1,000,000 isn’t a particularly useful number, so finding other ways to track changes more granularly might be helpful.
Moving Forward…With Your Help
Here’s the best part, I have no idea how you’re going to gather the data to calculate these numbers. If anyone wants to be entrepreneurial and charge us all copious amounts of money for an easy solution to gathering this information…please let me know, I’d love to join that party 🙂
I’m genuinely interested in getting input to improve this thing. There are a lot of complications, nuances, sub-networks and outliers when you start looking at networks in this way, so please jump in the comments and help refine.
I’m clearly stumped on how to generalize frequency, particularly how you take quality into account and how to give it a meaningful definition – is it average frequency along your Average Shortest Distance, is it between a set group of people or set size, is it a direct calculation of all interactions with some amazing algorithm that sits on top of it (and if so, what is that algo), etc.
I don’t know, and would love help!
Oh, and Standard Disclaimer:
I do not claim to have thought of everything, for every situation, nor do I intend to. These things worked in at least one situation and are not guaranteed to work everywhere. Individual results may vary. There is no substitute for experience and timely, intelligent decision making. This discussion is designed to explore several aspects of community which, when used collectively, could produce positive results. (Disclaimer borrowed from Ray Land)
Justin is a community management consultant. He founded Communl which mentors and trains in-house community managers to make them more awesome.
He has been doing the community management "thing" since 2003, which has included everything from launching communities from scratch to running the largest actively managed community on the interwebs.
That was the sound of my head exploding. Going to have to marinate on this one, but I think you could be onto something interesting with the interconnectedness relative to frequency of participation. And it would step us all away from the “how many members do I need” lunacy. Which would be awesome.
“F is fairly easy to understand, but AD and NP really confuse a lot of people.” – Right! I think we all agree that most community managers or social marketers currently track F in some way. Could be as simple as the “# of interactions per post,” or a slightly more complex “% of members who contribute per month”, etc. It seems to me that the cutting edge measure of F, as you ruminated, includes some factor for quality or sentiment. Some high-profile monitoring tools are helping pave the way with this type of measurement. But ultimately “F” is something we appreciate and something that we can pretty easily track in some way.
AD and NP however are uber-elusive. WTF? I have thought about these things (mostly a broad conception that there should be some value to a more-networked or more-connected community) but the conception falls short as soon as you think about implementation: There are no easy ways to define interconnectedness (either AD or NP) via a broad, multi-platform brand community (i.e. determining those numbers across FB, Twitter, LinkedIn, YouTube). Or what about the community manager that manages a custom social platform (non-FB/Twitter/LinkedIn). There are likely even more road blocks to develop a custom methodology for tracking these measures in that environment.
If there are cost-effective means to measure these things, I’m all in! I love the Network Density concept. Thanks for sharing and getting the head thinking (spinning?) a bit.
rosemaryoneill these things are done daily, i understand the concept but look at klout, tweetreach things like that. they measure how far a msg goes, its the same idea
mbhahn rosemaryoneill Not sure how klout or tweetreach relates to community health or network density. Can you elaborate?
benfowler I’m thinking someone needs to build a community platform similar to fb groups, forums, ning etc… but with this kind of tracking built into the backend. Make it accessible to anyone building community.
Density, I thought he meant how far a message goes? Klout measures reach, which is the same thing, and tweetreach measures how many degrees a tweet was tweeted and how deep: did A retweet because B tweeted, then B got C to tweet, etc. Same idea for both.
DavidSpinks benfowler FB is not a community platform.
mbhahn Actually, slightly different idea. Klout etc is about reach. This is about relationship.
Klout measures your ability to get retweeted or get your message to random other people.
Network Density measures the relationship strength of your community and has little to do with reach.
mbhahn DavidSpinks benfowler why is FB not a community platform? What about groups?
DavidSpinks mbhahn benfowler You have to give in on this one mbhahn – Facebook IS CONSIDERED a community platform for many brands. I get what you’re saying – there are other more truly collaborative/community-focused platforms, but Facebook, to PR, customer-service, or marketing departments, is definitely a community platform where real engagement happens.
benfowler I don’t think it’s that difficult to measure in specific cases, but it would be very difficult to do in a general way.
I would argue that, even though I am not following you, this reply to your comment counts as a connection. If that’s the case (or if that’s the case that you’ve decided on) all you would need is a way to track the F between you and me. Then you can measure the AD and NP…and the interesting part here is that if you do this then you can actually put a “strength” value on each connection. Chatting with someone regularly and in depth means a boat load more than a Follow 6 months ago with no further interaction.
Also, I’d argue that you shouldn’t be measuring *across* platforms, only *within* them. You may have “one community” but each platform or discussion space lends itself to different interaction methods, even if it’s the same people, and therefore the scale and direction of change in your metrics will be very different for each. RTs aren’t a “thing” on this LiveFyre comment…so why would I apply Twitter network metrics to it?
As always, what you track isn’t really as important as the direction it goes. And with Network Density, it’s all about strength and depth of connections…so just define what a “connection” is within your community and measure that.
All that being said, I bet you could actually commoditize the measurement of this in the same way Google Analytics did it for site traffic.
justinisaf mbhahn I am trying to say they are similar in the way of measuring the weight of a particular message. If bob posts a message and it gets shared by suzy and jen, then gets shared by their friends and moves down the line, it’s the same concept. It’s the content that determines the length of travel; organic vs non-organic, and whether they were coerced or it was a natural share of info, are variables too.
Justin, I think this is great and would love to collaborate on it. I think the start of calculating NP and AD is using a mapping tool/database similar to http://nodexl.codeplex.com/ if you tied those records to a CRM system, you could even tease out specific qualities that your “top” community members exhibit. Could apply that in reverse and also see the qualities/interactions of people who churn. I’m going to play around with some data I have and see what comes out of it.
Also, for frequency, you could use a tool (that functions similarly to sentiment analysis tools) to give each conversation a grade. If it had a learning capability and manual override, even if it only had about 70% accuracy (roughly how well sentiment analysis does), it would be a good start.
I think that after you map AD and NP some natural “buckets” will appear. The top 2% who make 80% of the conversation, the lurkers, etc. Then you calculate frequency for each bucket and if you want to boil it down to a single number for the entire community, do a weighted average of the “bucket frequencies” based on how valuable each bucket is to the health of the community.
celivingston NodeXL is a great tool, but I find it fails pretty quickly once you get into communities of any meaningful size. If you want to do real heavy analysis, look into neo4j which is a graph database which a few analytics programs fit on top of.
We had to move into biology modeling software because they were the only ones that could handle the size without crapping out.
celivingston It’s actually really interesting – the buckets that I think you are talking about actually emerge pretty quickly around interest, rather than participation.
Often you’ll have a large blob, a few very dense blobs within that and then a bunch of offshoots. They look like someone threw a bucket of paint at a wall. If you dive in, I bet you’d see interest groups rather than participation levels.
This is also VERY powerful for finding the “nodes” who are the genuine influencers within and between communities, rather than using something like “reach” which means very little in a real world scenario.
Would love to keep diving in on this!
This is really cool…how would you measure a Like on FB? Is that really an interaction? (I’d argue it’s a signal type one not bi-directional, ie: round-trip). How about bleedover via sharing with other platforms (ergo communities)? – Are there benchmarks established for well-known social networks?
Very interested in developing this further for external channels like Twitter/G+/Pinterest/Pheed and all the other usual suspects!
Thanks!
Interesting post Justin… I think the best thing each of us can do is come up with a simple measure we use for this week and then change it for next… early days you’re after people to join the community… we know roughly 1 in 1000 is going to engage at anytime so hey, aim to get 1000 people in so you can get a couple of folk talking to each other… then once the ball is rolling shift your attention to independent posts…
A one size fits all formula isn’t going to cut it… look at kred, klout, etc what do they actually measure… they measure those who choose to engage in social media… i’d like to think my surgeon is more interested in medical cures than tweeting how few stitches he used on the last op to get his sewing badge 😉
now an index of alternative measures based on evolution and market environments… that’s something we could all work on and invent together in a wiki type of way, where we’re all intrinsically motivated to have some fun with it and come up with models that actually float but don’t have a stock market valuation!
Now that’s community folks 😉
JeromePineau As always, I’d say if it works for your community, measure it…but you’re right: a “like” is generally a content signal, not a relationship signal, and it’s not reciprocal so that has to be taken into account if you do try to use it.
I’ll be honest, I’m not sure I’ve ever heard of ergo communities before, can you explain more?
I’m certain that there are numbers for the whole of large networks (FB has an AD as low as 3.74 I think), but that can vary wildly within subnetworks (my family on FB has an AD much lower than that for example). I also don’t know that anyone has done F and NP measures on most things.
Drop me a line if you do create metrics for networks, I would love to hear your results!
I’m in the middle of Lada Adamic’s Social Network Analysis course on Coursera, via University of Michigan. We use Gephi to visualize and calculate some of the ?’s that show up in this post – we gather that data and upload the .gml file into Gephi and it gives us tons of info. From there you can identify individual users, see their connections – identify who is the most important based on degree of centrality and also those connections who are bridges between different and potentially remote users/clusters in your network. Fortuitously, there are a number of resources that can be used for visualizing your networks, and from these visualizations – you can create a strategy: http://selection.datavisualization.ch/
This is really interesting! We tried to do a similar thing with the attendee data at TEDxVancouver. We started by sending email invites to all the people that attended in past years. From there they could submit the emails of 1 or 2 friends that they thought would be interested in coming to the event, and so on.
On the day of the event we visualized this information on the nametags of the attendees. If you invited someone you received a diamond shape on your card, and their name below yours. If you were invited you received a hexagon shape on your card with the person who invited you listed below your name. It was a really interesting experiment in mapping and visualizing the community. I think that our AD was pretty low, even with 2000 attendees, because the majority of people were invited by someone else. Our number of paths for the day would have been high, especially since it was an in-person event, everyone could talk to anyone they wanted. Frequency of interactions is trickier to guess, but I assume we would have to take into account the number of breaks and how many people you could speak to in that time.
Justin – this is a fascinating concept, and I love how you’ve outlined the formula. Have you seen @bernardohuberman’s work in this area? http://h30507.www3.hp.com/t5/Data-Central/What-makes-a-tweet-influential-New-HP-Labs-social-media-research/ba-p/81855 – some similar ideas to what you outline here.
tonia_ries I hadn’t seen it, but I’m looking into it now, it looks like some really fascinating stuff. If you’re looking for more in the area as well, definitely check out stuff by Dr Michael Wu over at Lithium who has some fascinating stuff and some of it might be public.
Thanks!
terakristen that’s awesome! I love it when offline and online intersect and metrics cross over 🙂
Was there anything that you guys put out on it, or have you followed up on it to build or strengthen the community online or offline? Is there any way I could get a copy of the data to nerd out on???
Being a graph nerd I love this article! justinisaf, you bring up a lot of great points about complexity and how much we don’t know about these things.
I’m actually exploring the measurement tool you talk about here. I’ve been researching with my CM friends but would love to start talking to the larger community… and well you asked 🙂
http://narratus.io is the launch page, but more importantly I really want to start a conversation and repository about measuring community. To this end I put up a forum: http://discourse.narratus.io. I’m a data guy, not a CM so please excuse my naive attempt at getting a community for community measurement started 🙂
I love this comment thread, it’s gold.
DataRiot Fantastic! I would love to partner with you to build out a CM analytics product. Drop me a line: hi@justinisaf.com and let’s chat!