
The Wages of Combinatorics: Trying to figure out the best way to present the licensing interface for NewsBruiser, I decided to see which Creative Commons licenses were the most popular. I ran a Google link: search for all 11 Creative Commons licenses, plus additional CC-provided licenses like the public domain dedication 'license' and the CC-GNU GPL and LGPL. I used the canonical license link: the URL I would link to if I were licensing something. It turns out that only four or five of the 15+ licenses are used by any substantial number of people.

The top five licenses are all Attribution-type licenses, with the most popular being Attribution-NonCommercial-ShareAlike with nearly 25,000 hits. The least popular Creative Commons license, with only 13 takers, is the CC-GNU LGPL license.

License Design Implications

It looks like Creative Commons has enough licenses. The core licenses have proven very effective at meeting people's needs, and there's no need to keep minting new ones because as far as I can tell the new ones don't get used. This might indicate that the new licenses are created in response to "You should have such-and-such a license" feedback rather than in response to people who don't like any of the existing Creative Commons licenses.

For instance, consider the Creative Commons "Music Sharing License", which is effectively the same as Attribution-NonCommercial-NoDerivs. It's a lot more popular than its forlorn twin, but here "popularity" means 177 hits on Google instead of 16. It's dwarfed by its own commentary: Googling for "music sharing license" gets you 530 results talking about the latest license from Creative Commons. Compare that to a string search for, e.g., "Attribution-Noncommercial-Sharealike": you get people actually using the license, not talking about it. Creating a new license gets Creative Commons some buzz, but not much more; it looks like everyone who wants a license already has one they like.

The GNU licenses were probably created on behalf of GPL fanatics who kept bugging the Creative Commons people with "Why don't you have a GPL license?" Now that they're here, few people use them. My guess is that most people who want to use the GPL are using the actual GPL.

I didn't include them (or Music Sharing) in the survey because they're for music, but the URLs to the Sampling and Sampling Plus licenses are also mentioned almost nowhere on the web. This might be because no one uses them, because they're newer, because they don't get used correctly, or because they get used in ways that Google can't pick up. My methodology isn't as good for licenses not used by web pages (which probably also affects the GPL/LGPL numbers), so I will reserve judgement on these--but again, I searched for links to the URL I'd link to if I were licensing something under that Creative Commons license. The fact that the most popular licenses get tens of thousands of hits indicates that there's not some systemic disconnect where people don't link to the license they're using.

On the flip side, it does look like the CC public domain dedication is getting some good use--it's the sixth most popular license. I think this and the Founders' Copyright license (about which below) feed into a core competency of Creative Commons--being a trusted repository for copyright assignments.

UI Implications

If you want to let someone choose a Creative Commons license (in addition to whatever non-CC licenses), you don't need a bunch of orthogonal sets of radio buttons like the Creative Commons license picker has. You can provide four individual radio buttons and an 'other' field. This will handle 84% of the cases without greatly inconveniencing the other 16% of the people. (You want an 'other' field anyway, since people come up with all kinds of crazy non-CC licenses for their stuff.)
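The "84% of the cases" figure is a cumulative share: the hits for the most popular few licenses divided by the hits for all of them. Here is a minimal sketch of that arithmetic. The helper name top_n_coverage is hypothetical, and the counts in the usage lines are placeholders for illustration, not the survey data.

```python
def top_n_coverage(hits_by_license, n):
    """Return the fraction of all hits that go to the n most popular licenses."""
    counts = sorted(hits_by_license.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:n]) / total


# Placeholder counts for illustration only; these are NOT the survey numbers.
example_hits = {
    "license-a": 500,
    "license-b": 250,
    "license-c": 150,
    "license-d": 60,
    "license-e": 40,
}

# Share of users covered by offering the top four as radio buttons.
print(f"{top_n_coverage(example_hits, 4):.0%}")
```

Running the same function for n = 3, 4, 5 shows how quickly the share flattens out, which is the point at which extra radio buttons stop paying for themselves.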

I couldn't measure the success of the Founders' Copyright because I couldn't find one canonical URL for the license. However, it's a good idea and I'm going to make it an option in NewsBruiser. This brings up my other UI design factor: in addition to the four most popular Creative Commons licenses, you can add whatever ones you like and think people should use. They'll be more likely to be used if they're options and not an 'other'.

Data below, if you're interested:
Relative popularity of Creative Commons text licenses

License                               Google hits  % of total  First result
Attribution-NonCommercial-ShareAlike        24900       31.63  Photographers' and Illustrators' Artist Corners | Creative Commons
Attribution                                 10300       13.08  Photographers' and Illustrators' Artist Corners | Creative Commons
Attribution-NonCommercial                    5060        6.43  MSNBC - GlennReynolds.com
Public-Domain                                3080        3.91  Mike Linksvayer
Attribution-NoDerivs                         1460        1.85  Among Other Things
ShareAlike                                   1060        1.35  ACM SIGWEB - Conferences
NonCommercial-ShareAlike                      769        0.98  Trademarks, Free Speech, and ChillingEffects.org
NonCommercial                                 323        0.41  Search Engines Directory
NoDerivs                                      126        0.16  Browse Top Level > Moving Images > Brick Films > LEGO
CC-GNU-GPL                                     31        0.04  blogkomm: download
Attribution-NonCommercial-NoDerivs             16        0.02  Photographers' and Illustrators' Artist Corners | Creative Commons
CC-GNU-LGPL                                    13        0.02  Carlsbad Cubes - Wolf Paulus' Web Journal

Posted by Leonard at Tue May 25 2004 14:24


Interesting data! To try and figure out what was going on, I decided to manually check the results for the sampling license. I chose it because it would be easy: it had relatively few results whether you used Google or Yahoo.

Yahoo did say it found 47 results, but it only offered 4 on the result page. Visual inspection indicated that the same site showed up in the #2 and #3 slots. Expanding the results indicated that there were lots of results for other pages in the website that was in #2 and #3.

Google had 2 results. One was new and the other was the one in the #4 slot on Yahoo. It seems there are 4 distinct sites that link to the Sampling license. Yahoo picked up 3 of them and Google picked up 2.

Google did miss a lot of the results, but its guess of 2 was, by perverse coincidence, a lot closer to the final result than Yahoo's guess of 47. It looks like Yahoo knows how to filter out multiple pages from the same site when it displays results, but not when it displays totals.

For Sampling Plus, Google says 17 results and displays 8 results from 5 different sites. (Not sure why your data says 0; are you doing an automated search that does something weird to the plus sign?) Yahoo says 65 and displays 20 links from 17 different sites. Google has 1 link Yahoo doesn't, so the total is 18. Again Google's number is closer, again by coincidence.
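One way an automated search could mangle the plus sign: in a URL query string, an unencoded "+" is decoded as a space, so the search term has to be percent-encoded before it is sent. This is only a sketch of that one failure mode, not a diagnosis of the actual script. The search term below is a made-up placeholder, and encode_query and decode_query are hypothetical helper names wrapping Python's standard urllib.parse functions.

```python
from urllib.parse import quote_plus, unquote_plus


def encode_query(term):
    """Percent-encode a search term for use as a query-string value."""
    return quote_plus(term)


def decode_query(raw):
    """Decode a query-string value the way a form-handling server would."""
    return unquote_plus(raw)


# Placeholder search term containing a literal plus sign (not a real license URL).
term = "link:example.org/licenses/sampling+/1.0/"

# Sent as-is, the plus sign comes back as a space, so a different URL is searched.
print(decode_query(term))

# Encoded first, the plus sign becomes %2B and survives the round trip.
print(encode_query(term))
print(decode_query(encode_query(term)) == term)
```

If the search script pastes the license URL straight into the query string, the search engine would see "sampling /1.0/" and could plausibly report no links at all.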

I think Yahoo is counting pages, not sites, in its total. This makes it better for determining how many distinct *works* (e.g. songs, weblog entries) are licensed under a given license. But its supposed totals are disproportionately large compared to the number of unique sites it actually gives you, which makes me think Google is better (though it undercounts) for determining how many *individuals* have chosen a specific license. If Yahoo changed its total display to be the total number of sites, it would become much more accurate for counting people but we'd have no way to count documents.
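The pages-versus-sites distinction, and the merging of two engines' results, can be sketched as set operations on the result URLs. This is a rough illustration under two assumptions: a "site" is taken to be the URL's hostname (so www.a.example and a.example would count separately), and the URLs are placeholders rather than the real search results. The function names are hypothetical.

```python
from urllib.parse import urlsplit


def distinct_sites(result_urls):
    """Return the set of hostnames that appear in a list of result URLs."""
    return {urlsplit(url).hostname for url in result_urls}


def count_pages_and_sites(result_urls):
    """Return (number of distinct pages, number of distinct sites)."""
    pages = set(result_urls)
    return len(pages), len(distinct_sites(pages))


# Placeholder results standing in for two search engines' output.
engine_one = [
    "http://a.example/song1.html",
    "http://a.example/song2.html",
    "http://a.example/song3.html",
    "http://b.example/weblog/",
]
engine_two = [
    "http://b.example/weblog/",
    "http://c.example/",
]

# One engine's view: many pages, few sites.
print(count_pages_and_sites(engine_one))

# Sites found by either engine: the union of the two sets.
print(len(distinct_sites(engine_one) | distinct_sites(engine_two)))
```

The page count approximates the number of licensed works, while the site count approximates the number of people who picked the license.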

Does this make any sense?

Posted by Leonard at Tue May 25 2004 14:42

Follow-up comment tying up loose ends:

Thanks for reducing the complexity of the license picker. I and others thank you.

The license picker popup looks nice. I'll probably use it as an adjunct to the 'Other' field. I need to have 'Full copyright' as an option, and if I have that I want some less restrictive options displayed on the same level to encourage their use.

Do you think the Founders' Copyright is appropriate for a weblog? It seems like its high overhead makes it more suitable for something like a book. Or can you copyright the weblog once and have it applied to everything you ever post to the weblog?

I'll certainly let you know if I need any more help integrating CC licensing into NewsBruiser. It doesn't look like I will, thanks to all the documentation you have, but I'll let you know if I run into problems.

Posted by Mike Linksvayer at Tue May 25 2004 16:07

When I said above "Yahoo is complete" I meant complete for the subset of the web that Yahoo has indexed. :)

Posted by Leonard at Tue May 25 2004 16:16

Ok, makes sense. I'll do a chart of the Yahoo results as well.

I thought of a silly attack on the Founders' Copyright where I would create thousands of 'works' and sell them all to Creative Commons for a dollar each. Presumably Creative Commons would not fall for this. :)


Unless otherwise noted, all content licensed by Leonard Richardson
under a Creative Commons License.