Bryce Boe

The Adventures of a UCSB Computer Science Ph.D. Student


Using StackOverflow’s API to Find the Top Web Frameworks

21 February, 2011 (18:09) | General | By: Bryce Boe

Update 2011/02/23 11:02 PST
Added the lift tag and updated the list.

Update 2011/02/22 13:19 PST
Added the jsf tag (java server faces) and updated the total question count for each item on the list.

Update 2011/02/22 11:14 PST
Added spring-mvc, as that is what I was originally supposed to have.

Update 2011/02/22 10:36 PST
For the interested, here is the table used to generate the graph.

Update 2011/02/22 10:05 PST
As per comments on this post, I updated the list by removing hibernate, spring, and sass and adding gwt and grails. I also updated the chart to reflect this information, and created an additional chart which plots the frameworks as a percentage of the questions asked each week to hide StackOverflow’s growing popularity.

Adam and I are currently in the process of working on our research about the Execution After Redirect, or EAR, Vulnerability which I previously discussed in my blog post about the 2010 iCTF. While Adam is working on a static analyzer to detect EARs in ruby on rails projects, I am testing how simple it is for a developer to introduce an EAR vulnerability in several popular web frameworks. In order to do that, I first needed to come up with a mostly unbiased list of popular web frameworks.

My first thought was to search for the top web frameworks, hoping that the information I sought was already available. This search provided a few interesting results, such as the site Best Web-Frameworks and the page Framework Usage Statistics by the group BuiltWith. The Best Web-Frameworks page lists and compares various web frameworks by language; however, it offers no means to compare the adoption of each. The Framework Usage Statistics page caught my eye because its usage statistics are generated by crawling and fingerprinting websites to determine which frameworks are in use. Their fingerprinting technique, however, is too generic in some cases, which results in languages like php and perl being labeled as frameworks. While these results were a step in the right direction, what I was really hoping to find was a list of top web frameworks that follow the model-view-controller (MVC) architecture.

After a bit more consideration I realized it wouldn’t be simple to get a list of frameworks by usage, so I had to consider alternative metrics. I thought I could measure the popularity of a framework by the number of developers using it, or at least interested in it. It was this train of thought that led me to both Google Trends and StackOverflow. Google Trends allows one to perform a direct comparison of various search queries over time, such as ruby on rails compared to python. The problem, as evidenced by the former link, is that some of the search queries don’t directly apply to the web framework; in this case not all the people searching for django are looking for the web framework. Because of this problem, I decided a more direct approach was needed.

StackOverflow is a website geared towards developers where they can go to ask questions about various programming languages, development environments, algorithms, and, yes, even web frameworks. When someone asks a question, they can add tags to the question to help guide it to the right community. Thus if I had a question about redirects in ruby on rails, I might add the tag ruby-on-rails. Furthermore, if I were interested in questions other people had about ruby on rails, I might follow the ruby-on-rails tag.

Between the number of questions per tag, the number of answers per tag, and the number of followers per tag, StackOverflow provides a few metrics for measuring the relative level of developer interest in various web frameworks. Success! The next step was to extract these numbers for the tags of various frameworks. For this, I attempted to find StackOverflow tags corresponding to all the frameworks listed on the Best Web-Frameworks site I previously found. I skipped the framework languages CSS and Javascript as they aren’t server side frameworks. I then narrowed the list down to the frameworks which had at least 100 questions asked.
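The narrowing step can be sketched roughly as follows. This is a minimal illustration in Python 3, not the script used for the post: the counts are copied from the list in this post, while the function name popular_tags and the tag example-small-tag are made up for the example.

```python
# Question counts copied from the list in this post; in practice they
# would come from the StackOverflow API.
tag_counts = {
    'django': 14951,
    'ruby-on-rails': 31156,
    'catalyst': 106,
    'asp.net-mvc': 20587,
    'example-small-tag': 42,  # hypothetical tag that falls below the cutoff
}


def popular_tags(counts, min_questions=100):
    """Return (tag, count) pairs with at least min_questions, largest first."""
    kept = [(tag, n) for tag, n in counts.items() if n >= min_questions]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


ranked = popular_tags(tag_counts)
for rank, (tag, n) in enumerate(ranked, start=1):
    print('%d. (%d) %s' % (rank, n, tag))
```

The cutoff is a parameter, so the same function can be rerun with a stricter or looser threshold.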

This produced the following frameworks sorted by total number of questions asked:

  1. (31156) ruby-on-rails
  2. (20587) asp.net-mvc
  3. (14951) django
  4. (4726) zend-framework
  5. (3510) jsf
  6. (3336) gwt
  7. (3296) cakephp
  8. (3127) codeigniter
  9. (2731) grails
  10. (1976) spring-mvc
  11. (1603) symfony
  12. (912) struts
  13. (538) kohana
  14. (515) pylons
  15. (514) sinatra
  16. (506) dotnetnuke
  17. (420) wicket
  18. (227) lift
  19. (194) yii
  20. (163) cherrypy
  21. (126) web2py
  22. (106) catalyst
  • (6609) hibernate (note: not a web framework)
  • (5765) spring (note: not a web framework)
  • (178) sass (note: not a web framework)
This list alone seems to work fairly well; however, I wanted to take it one step further and see the number of questions asked per week since the start of StackOverflow. Using the StackOverflow API (I used the API to generate the previous list too), I wrote a script to generate a CSV file containing this information. The information is depicted in the interactive chart below for the top 10 frameworks by total number of StackOverflow questions. Each point in the chart represents the number of questions asked in a one-week period starting on the date of the data point (protip: hover over the chart to get the exact values).
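To make the weekly bucketing concrete, here is a small offline sketch in Python 3. It assumes the questions are already available as Unix timestamps (the real script asks the API for a count per window instead), and it treats each week as a half-open interval so that no question is counted twice; how the API itself handles window boundaries is not something this post establishes. The helper names and the sample timestamps are hypothetical.

```python
import csv
import io

DATE_START = 1217540572   # start timestamp used by the post's script
WEEK_SECONDS = 604800     # 7 * 24 * 60 * 60


def week_windows(start, now, week_seconds=WEEK_SECONDS):
    """Yield (window_start, window_end) for each complete week before now."""
    num_weeks = (now - start) // week_seconds
    for i in range(num_weeks):
        yield start + i * week_seconds, start + (i + 1) * week_seconds


def weekly_counts(timestamps, start, now):
    """Count how many timestamps fall in each half-open window [lo, hi)."""
    return [sum(1 for t in timestamps if lo <= t < hi)
            for lo, hi in week_windows(start, now)]


def to_csv(counts_by_tag):
    """Render a header row of tag names, then one row per week."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    tags = list(counts_by_tag)
    writer.writerow(tags)
    for row in zip(*(counts_by_tag[tag] for tag in tags)):
        writer.writerow(row)
    return out.getvalue()


# Hypothetical question timestamps, expressed as offsets from DATE_START.
now = DATE_START + 3 * WEEK_SECONDS
fake = {
    'django': [DATE_START + 10, DATE_START + WEEK_SECONDS + 5],
    'sinatra': [DATE_START + 2 * WEEK_SECONDS],
}
table = {tag: weekly_counts(ts, DATE_START, now) for tag, ts in fake.items()}
csv_text = to_csv(table)
print(csv_text)
```

Each CSV row then corresponds to one point per framework on the chart.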

Note: if the above graph doesn’t load, try this static image.

The data confirms my previous suspicion that ruby on rails is the number one MVC framework and that django and cakePHP would also appear in the top 10. I must admit that I had never before heard of asp.net MVC; however, considering that stackoverflow and all other stackexchange sites run on asp.net MVC, it makes sense that it would rank quite high.

I added the below chart to show the relative percentage of questions per tag over time as per Big Dave’s Gusset’s comment. This hides the growing popularity of stackoverflow.
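The normalization amounts to dividing each tag's weekly count by a weekly total. The sketch below, in Python 3, takes that total as an input, since this post does not spell out whether the denominator was all StackOverflow questions or only those for the tracked tags. The function name and all of the numbers are hypothetical.

```python
def weekly_percentages(counts_by_tag, weekly_totals):
    """Convert raw weekly counts to percentages of each week's total."""
    result = {}
    for tag, counts in counts_by_tag.items():
        result[tag] = [
            100.0 * count / total if total else 0.0  # guard against empty weeks
            for count, total in zip(counts, weekly_totals)
        ]
    return result


# Hypothetical numbers for three weeks.
counts = {'ruby-on-rails': [10, 30, 60], 'django': [5, 10, 40]}
totals = [100, 200, 400]  # assumed: all questions asked in each week
shares = weekly_percentages(counts, totals)
print(shares)
```

In this made-up data the raw ruby-on-rails count doubles between the second and third weeks while its share stays flat, which is exactly the site-growth effect the percentage chart is meant to remove.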


(Interactive version)

The data for the above chart was extracted using the following script. The script requires the python package py-stackexchange in order to run and can be easily modified to add additional tags or change the filtering methods.

    #!/usr/bin/env python
    # Uses Python 3 syntax; requires the py-stackexchange package.
    import sys
    import time

    from stackexchange import Site, StackOverflow

    frameworks = [
        # php
        'zend-framework', 'cakephp', 'symfony', 'codeigniter', 'seagull',
        'prado', 'solar', 'ezcomponents', 'kohana', 'jelix', 'flow3',
        'modx', 'sapphire', 'yii', 'limonade', 'tekuna', 'doophp',
        'fat-free', 'akelos', 'php-on-trax', 'atk',
        # ruby
        'ruby-on-rails', 'merb', 'ramaze', 'halcyon', 'sinatra', 'webby',
        'sass',
        # perl
        'catalyst', 'interchange', 'mason', 'cgi-application', 'jifty',
        'gantry', 'dancer', 'mojolicious',
        # java
        'struts', 'hibernate', 'spring', 'wicket', 'play', 'stripes',
        # python
        'django', 'pylons', 'grok', 'turbogears', 'web2py', 'cherrypy',
        # coldfusion
        'cfwheels', 'coldspring', 'model-glue',
        # asp.net
        'asp.net-mvc', 'dotnetnuke', 'monorail', 'vici',
    ]


    class TagStats(object):
        DATE_START = 1217540572  # Unix timestamp of the first week (31 July 2008 UTC)
        WEEK_SECONDS = 604800    # seconds in one week

        def __init__(self, tag_names):
            self.so = Site(StackOverflow, 'LzYJwh19o0WCIvXK9q6k6g')
            self.tag_names = tag_names
            self.tags = []
            self.stats = {}

        def output_counts(self, html=False):
            """Print the total question count per tag, largest first."""
            tmp = []
            for tag in sorted(self.tags, key=lambda x: x.count, reverse=True):
                tmp.append((tag.count, tag.name))
            if html:
                print('<ol>')
                for count, name in tmp:
                    print(('<li>(%d) <a href="http://stackoverflow.com/tags/%s">'
                           '%s</a></li>') % (count, name, name))
                print('</ol>')
            else:
                print('\n'.join(['%8d %s' % x for x in tmp]))

        def get_tags(self, min_size):
            """Look up each tag by exact name; keep those with enough questions."""
            for name in self.tag_names:
                query = self.so.tags(filter=name)
                # The filter can return several tags, so find the exact match.
                for tmp in query:
                    if name == tmp.name:
                        break
                else:
                    sys.stderr.write('Not found: %s\n' % name)
                    continue
                if tmp.count < min_size:
                    sys.stderr.write('Too few questions: %s\n' % name)
                else:
                    self.tags.append(tmp)
            # One independent list per tag ([[]] * n would share a single list).
            self.stats = {tag.name: [] for tag in self.tags}

        def output_stats_by_week(self, start_week=0):
            """Print a header row, then one CSV row of per-tag counts per week."""
            now = int(time.time())
            num_weeks = (now - self.DATE_START) // self.WEEK_SECONDS
            print(', '.join(tag.name for tag in self.tags))
            for i in range(start_week, num_weeks):
                sys.stdout.flush()
                start = self.DATE_START + i * self.WEEK_SECONDS
                end = self.DATE_START + (i + 1) * self.WEEK_SECONDS
                counts = []
                for tag in self.tags:
                    try:
                        count = self.so.questions(tagged=tag.name,
                                                  fromdate=start,
                                                  todate=end).total
                    except Exception:
                        # Report the week so the run can be resumed from it.
                        sys.stderr.write('Stopped at week %d\n' % i)
                        sys.exit(1)
                    self.stats[tag.name].append(count)
                    counts.append(str(count))
                print(', '.join(counts))


    def main():
        # Optional first argument: the week index to resume from.
        try:
            start_week = int(sys.argv[1])
        except IndexError:
            start_week = 0
        tag_stats = TagStats(frameworks)
        tag_stats.get_tags(100)
        #tag_stats.output_counts(html=True)
        tag_stats.output_stats_by_week(start_week)

    if __name__ == '__main__':
        sys.exit(main())

Happy web-framework coding!


Comments

Comment from Pedro Assunção
Time 2011/02/21 at 11:49 PM

I don’t think hibernate qualifies as a web framework. It’s an Object Relational Mapping framework: http://en.wikipedia.org/wiki/Hibernate_(Java)

You might want to double check the rest of the list :)

Comment from Bryce Boe
Time 2011/02/21 at 11:55 PM

Pedro- I did make one assumption which was the frameworks listed on bestwebframeworks were in fact web frameworks. Looks like that’s not quite the case. I’ll make a small note. Thanks for the heads up.

Pingback from Tweets that mention Bryce Boe » Using StackOverflow’s API to Find the Top Web Frameworks — Topsy.com
Time 2011/02/22 at 12:51 AM

[...] This post was mentioned on Twitter by 囧.史密斯 and Sun Ning, Bryce Boe. Bryce Boe said: New blog post: Using StackOverflow’s API to Find the Top Web Frameworks — http://goo.gl/Z5LNU [...]

Comment from Stephan Schmidt
Time 2011/02/22 at 1:23 AM

Your list looks a *little* bit arbitrary. For starters, Javascript is a nice server side framework base, e.g. with NodeJS (769).

One might also add : JSP (4,232), Servlets (2,724), Grails(2,717), JSF (3,495), Lift (224) …

Best
Stephan
http://codemonkeyism.com

Comment from Christoffer Hammarström
Time 2011/02/22 at 1:41 AM

PHP is its own Web Framework. It would be interesting to know how popular PHP without a framework on top of it is.

Comment from Bryce Boe
Time 2011/02/22 at 2:32 AM

Christoffer- I didn’t include PHP because IMHO it’s a language and out of the box doesn’t support the MVC architecture. Additionally, because some frameworks such as zend-framework and cakePHP are php frameworks, some questions may be tagged with both the framework-specific tag and the php tag, thus making the tags for languages somewhat unreliable for a direct comparison.

Nevertheless, as of a few minutes ago (relative to this post), the stackoverflow php tag has 88,666 questions which is significantly more than all of the frameworks listed. You could use the script I provided to additionally get the number of questions over time for the php tag.

Comment from Kissaki
Time 2011/02/22 at 3:22 AM

What about Joomla?!?
15*91 = more than 1300 questions.

Comment from Kissaki
Time 2011/02/22 at 3:24 AM

Mh, thinking about it maybe you excluded it because one could categorize it more as a CMS than a framework? (Although there’s an extendible framework below the default-CMS.)

Comment from Caley Woods
Time 2011/02/22 at 6:40 AM

SASS is also not a framework. It’s an extension of css3 that adds some things like nesting.

Comment from Dave Van den Eynde
Time 2011/02/22 at 6:50 AM

I think it suffices to say that what you have found out was not which web framework was the most popular, but which web framework gathered the most questions. I think there’s a subtle difference.

Comment from Enzam
Time 2011/02/22 at 8:52 AM

I agree with Dave, the data just show a rough combination of how popular and hard the frameworks are.

Comment from Dan Fabulich
Time 2011/02/22 at 8:55 AM

Add GWT, the Google Web Toolkit. 3300 questions.

Comment from Billy
Time 2011/02/22 at 9:04 AM

I think your hypothesis is wrong to begin with.
Asking questions about a framework does not make it *top* used framework, it just makes it hard to work with…

Comment from Salvador Diaz
Time 2011/02/22 at 9:04 AM

I’d like to point out that GWT is one of the most popular Java web frameworks and it’d be interesting to see how it compares to the ones in your list

Comment from JM Ibanez
Time 2011/02/22 at 9:05 AM

I have to agree with Dave Van den Eynde. It’s a subtle difference. Some frameworks might just make sense that there are a few questions to how to do stuff in it (or it might be that the framework is so lightweight that there aren’t many questions about how to use it).

For instance: comparing Sinatra to Ruby-on-Rails: Sinatra is very lightweight, so a lot of things that would have been asked wouldn’t. Plus Sinatra is a pure web framework (no persistence), unlike Rails — there would be a lot of questions tagged with Ruby-on-Rails, I expect, that are not just on the web layer end.

Comment from Davey
Time 2011/02/22 at 9:06 AM

As Dave Van den Eynde remarked, this only shows how many questions are being asked about a framework. Add to that the fact that a question about a framework is often a question about the language it uses.

So to rank popularity more accurately, perhaps it would be interesting to compare the useful information you have found for each framework, with the existing body of knowledge for both the framework itself and underlying language?

I think your information might be a good way of showing which framework has the most new users.

Comment from Bobby Martinez
Time 2011/02/22 at 9:07 AM

Hi Spring is not a web framework, and you missed out Grails which would come in at #8 if you’re going by questions tagged in SO http://stackoverflow.com/questions/tagged/grails

Comment from Big Dave’s Gusset
Time 2011/02/22 at 9:13 AM

I think your graph proves that StackOverflow is getting more popular. Perhaps you could weight the graph to take account of this?

Comment from Chuck Callebs
Time 2011/02/22 at 9:32 AM

I came to say pretty much what Dave did.

You can bet that the steeper the learning curve of each of these frameworks, the more questions they’ll have. For instance, a Rails user and a Sinatra user of similar programming ability walk into a bar — the Rails user will likely have more questions based upon the amount of convention and “magic” that goes on behind the scenes.

Alternatively, looking at it from a cynical view, there are a few other reasons why a particular framework would have more questions:

- The quality of programmers for the particular framework could be lower
- The framework itself could have many “gotchas”
- The community of one framework more actively uses stackoverflow than the community of others
- The community of one framework could be better at tagging their questions than the community of others

Ruby on Rails is obviously not the “top” web framework as far as usage goes. One can spend 5 minutes on a job search website to discover that. We can’t rely on SO question count to tell us much of anything except the SO question count.

Comment from alexander
Time 2011/02/22 at 9:44 AM

I’d be curious to see where Drupal would rank in this list. It’s got enough in the programmability department that a lot of people refer to it as a ‘content management framework’.

Comment from Colin Hawkett
Time 2011/02/22 at 10:25 AM

Perhaps you have shown that ruby-on-rails developers, out of all web programmers in all the world, know the least and speak the loudest? *duck* :)

Comment from Stuart B
Time 2011/02/22 at 10:27 AM

Another consideration: StackOverflow seems to be the go-to place for Q&A now, but when it was first getting started, some communities seemed to gravitate towards it more rapidly than others, notably ASP.NET MVC and iPhone development folks.

The graph over time helps with that, but total number of questions asked for all time will be affected.

Comment from Bryce Boe
Time 2011/02/22 at 10:35 AM

Kissaki- You are correct with your second comment. Joomla is more a CMS than an MVC-like framework.

Caley- I’ll make a note next to the original item. Thanks!

Dave/Enzam/Billy- Popularity is difficult to define precisely. Does it mean the framework most used to run various websites, or the one with the most developers using it, or in this case the one with the most questions being asked about it? As I stated originally, I wanted to compare the frameworks based on the number of websites using the technology. Unfortunately that didn’t seem feasible, so I opted to approximate that with the number of stackoverflow questions. I would be very intrigued to see how this metric compares to others.

Dan/Salvador- Added a second graph which adds GWT. I previously didn’t realize GWT had server side components. Thanks!

JM- You make a good point regarding the ease of use of the framework. This popularity metric is by no means perfect.

Bobby- I’ll make a note about Spring and see the newly added graph which contains Grails.

Big Dave’s- The graph does show an increasing popularity in stack overflow. I just added a third graph to show the relative percentage of questions by framework each week. Thanks!

Chuck- You made some additional great arguments. I would love to see a graph comparing frameworks based on job postings, however, I personally don’t want to do it :)

Alexander- While Drupal is a web framework of sorts, I wanted to focus more on the MVC style frameworks. I realize some of the ones I selected don’t fall into this category; that’s the benefit of peer review :) Feel free to use my script to generate your own dataset.

Colin- Agreed. This is sufficient for the listing of top frameworks that I needed.

Stuart- I too figured that would be the case, however, I was surprised to find that the results don’t change much when just looking at the number for last week. See for yourself.

Comment from Kristian J.
Time 2011/02/22 at 11:49 AM

It might be interesting to investigate which frameworks are primarily used by hobbyists and which are used by professionals. An indication of this might be found by looking at the numbers for saturday and sunday compared to the rest of the week.

Comment from Felix
Time 2011/02/22 at 12:28 PM

I think you’re missing JSF (Java Server Faces). It has 3499 tagged questions and is definitely more important than Struts by now (and included in the JEE standard).

Comment from Bryce Boe
Time 2011/02/22 at 12:29 PM

Felix- I’ll add this in the next round. I couldn’t quickly find the tag for it. Thanks for the feedback.

Update: Java server faces has now been added.

Comment from jpobst
Time 2011/02/22 at 1:13 PM

Very interesting graphs!

My nitpick would be that not all questions are tagged with the generic framework tag and the version specific framework tag.

ie:
ruby-on-rails-3 tag has 4572 questions
asp.net-mvc-2 tag has 5527 questions
asp.net-mvc-3 tag has 1408 questions

I’m sure it’s too hard to account for that, but just something to keep in mind when interpreting the results!

Comment from Bryce Boe
Time 2011/02/22 at 1:16 PM

jpobst- You are absolutely correct. I had thought about merging all the tags, and then decided against it; however, I neglected to mention that in the post. Thanks for the feedback.

Comment from William Billingsley
Time 2011/02/22 at 6:17 PM

Could you add Lift (Scala’s web framework)? Foursquare and a number of other high profile sites run on it, and it’d be interesting to see whether it’s picking up much momentum (I’d guess it would rank a little above grails, but that’s just a guess).

Comment from Bryce Boe
Time 2011/02/22 at 6:21 PM

William- I can tell you right now that the StackOverflow lift tag has only 226 questions, thus it would place 18th on the list and isn’t ranked high enough to include on the chart. You can see that number for yourself by visiting the tag’s page. The number is in the upper right under the search bar.

Comment from TheXenocide
Time 2011/02/23 at 6:12 AM

It’s worth noting that analyzing how often people ask questions about a given framework does not necessarily imply how popular it is. I’m just nitpicking in the interest of scientific accuracy as I’m sure the list here is at least reasonably useful to the desired extent, but it’s possible that Ruby on Rails or MVC are just more difficult to understand or accomplish certain tasks in than other frameworks. Additionally, there could be situations where one person has asked many questions about the same framework, which doesn’t necessarily indicate the framework is any more popular. Granted I’ve used them and I can apply my experience to refute this, but only as an opinion. I would be interested in framework usage measured by page requests across average users browsing. Though I imagine these metrics would be very difficult to obtain, it seems more useful to know which frameworks are handling the most real user traffic than which ones people are asking the most questions about.

Comment from Jens
Time 2011/02/23 at 10:07 AM

Like William Billingsley said, Lift would be nice. Twitter and many others use scala. Would be nice to see, how much it is used.

Comment from Bryce Boe
Time 2011/02/23 at 11:04 AM

Jens and William- Lift is now in the list.

Pingback from Using StackOverflow’s API to Find the Top Web Frameworks « Another Word For It
Time 2011/10/31 at 5:32 PM

[...] Using StackOverflow’s API to Find the Top Web Frameworks by Bryce Boe. [...]
