Why is the estimation of system constraints needed in system design interviews? - architecture

I'm a junior software developer with no experience in system design, and I'm currently preparing for a system design interview. I have read on many websites that I have to come up with estimates of the scale of the system I'm going to design, in order to help me when focusing on scaling, partitioning, load balancing, and caching.
Examples of these estimates:
Number of total active users
Total number of requests per second
Total size of data saved per second and per year
My problem is that I don't know what to do with these estimates.
In the interview I can suggest adding a load balancer and more hosts, adding a caching layer, adding master-slave database servers, and more, to increase performance, and I can discuss the trade-offs of each solution and which to choose in the future. But why should I come up with these numbers?
For example how will the number of requests per second affect my design?
Do hosts have a maximum number of requests per second? Do database management systems have a maximum size? I think the answer is no to both questions: it depends on many factors, and determining the number of hosts, or the need for database sharding or a master-slave design, depends in real life on practical experiments and testing. Please correct me if I'm mistaken, thanks!

For example how will the number of requests per second affect my design?
At 10 requests per second, no need for a load balancer; at 10,000 requests per second, you need to look at load balancers, stateless scaling, etc.
Do hosts have a maximum number of requests per second? Yes. It's not an absolute number, but at some point you will need to design in more scalability architecturally.
Do database management systems have a maximum size? Practically, yes. It also makes you consider things like NoSQL options etc. If you are looking at petabyte-scale data, you should probably think twice about throwing it all in a relational database.
In real life these thresholds aren't always cut-and-dried, but they can start the conversation and focus it on the most important scaling factors.
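To make that concrete, here is the kind of back-of-envelope arithmetic these interview estimates feed into. Every constant below (user count, requests per user, peak factor, per-host throughput) is an assumed example value, not a real benchmark:

```python
# Back-of-envelope sizing sketch; all constants are illustrative assumptions.

DAILY_ACTIVE_USERS = 1_000_000
REQUESTS_PER_USER_PER_DAY = 20
PEAK_FACTOR = 3          # assumed ratio of peak traffic to the daily average
REQS_PER_SERVER = 1_000  # assumed per-host throughput

SECONDS_PER_DAY = 24 * 60 * 60

avg_rps = DAILY_ACTIVE_USERS * REQUESTS_PER_USER_PER_DAY / SECONDS_PER_DAY
peak_rps = avg_rps * PEAK_FACTOR
servers = -(-peak_rps // REQS_PER_SERVER)  # ceiling division

print(f"average rps: {avg_rps:.0f}")   # ~231
print(f"peak rps:    {peak_rps:.0f}")  # ~694
print(f"servers:     {servers:.0f}")
```

A number like ~700 peak rps tells you a single decent host may still cope, while a result of 50,000 rps would immediately push the conversation toward load balancing and stateless scale-out.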


How do you do website capacity planning? [closed]

I just read the book The Art of Capacity Planning (BTW, I liked it), in which the author explains how important it is to measure your services, find your ceilings, forecast your needs, ensure smooth deployments, and so on. But throughout the book he draws on his experience at Flickr, where he faced the same product all the time.
Many of us work in companies that handle small-to-medium projects for other companies. We have to understand their business and their needs, and plan an architecture, a model, and so on.
Then the customer says, "I need to support 1000 users." Well, how many requests per second is a user? How long are their sessions? How much data do they transfer? Which operations do they execute, and how long do they take?
Sometimes it is possible to know those figures (by monitoring their existing applications, or because they have already taken those measurements), and sometimes it is not (because they do not have a current website, or it is just not possible to know).
How do you make a guess about the number of servers, bandwidth, storage, etc.? What reference figures do you use?
Some points you need to know to do this planning:
How many users per day.
How much data you are going to manage.
How much data you are going to show each user.
The average bandwidth each user may need.
The average time users spend on your site.
The average numbers can give you some idea of what you need monthly. Of course you also need to think about peak numbers, but when providers rent out web servers they price bandwidth by the month and disk space in gigabytes, so peaks are not an issue at the starting point. What you must think about is whether you run SQL queries that need too much RAM, or whether you share the machine with many other sites.
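As a sketch, the average figures above can be turned into a rough monthly estimate like this; all the numbers are invented example values:

```python
# Rough monthly transfer/storage estimate from average figures.
# Every number here is an assumed example value.

users_per_day = 2_000
pages_per_user = 10
page_size_kb = 200       # assumed average page weight
data_per_user_kb = 50    # assumed data written per user per day

days = 30
bandwidth_gb = users_per_day * pages_per_user * page_size_kb * days / 1024**2
storage_gb = users_per_day * data_per_user_kb * days / 1024**2

print(f"~{bandwidth_gb:.1f} GB transfer/month")  # ~114.4
print(f"~{storage_gb:.2f} GB new data/month")    # ~2.86
```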
Without a site and without experience, you do not actually have measurements.
Without measurements you cannot be sure, but you can follow some guidelines.
Whatever you do, try to make the growth of your data/features/load linear, not exponential.
The speed of your site does not depend only on the capacity and speed of your server; it depends on them only when the server is at its limits. If the server reaches its limit, you add more resources. But speed must also be taken care of when you design the software, and fast software costs more to build.
Do millions of rows of data enter the database every day? You need more RAM and disk.
Do you have video and many big files to send? You need more bandwidth.
Do you have people using the site for their work? You need more speed and stability.
Are you making one more e-commerce site? You need more security along with stability.
The goal is to have them all, but the priority of what you focus on first changes.
Planning for speed.
Performance and Capacity: Two different animals*. Performance is based more on human work, and capacity is based more on computer resources. To make things fast you first need to know how to make the computer run smoothly and fast, then learn the general tricks that make programs run fast, especially on the web, and then actually spend more time on the program after it is running, improving it for performance in the critical areas.
Planning for expansion.
Make a good software design and take care of the possibility of expansion in case you may need more, so you give your client the opportunity to start small and pay more only when needed. So when you design your software, think as though it will run in a web server pool: take care of synchronization, take care of shared resources, give it the ability to get data from different servers, and so on.
Planning with limits
OK, let's say the customer says he has only 1000 users and is interested in neither expansion nor speed, and just needs a cost-effective site that does its job. In this case you also design within those limits. What are these limits? You do not place tens of synchronization checks, and you make it work like a single-thread, single-pool program. You do not use any mutexes, any double checks, anything that matters only when you have two pools or two computers running the same application. You only note those points in the code so you can change them in case an upgrade is needed.
You also do not write any code that uses multi-computer resources, and when you run it you make sure it runs under only one pool so it works correctly.
This single-pool design is easier to develop, easier to debug, easier to control, easier to patch when there is buggy code, and it costs less, but it suffers in speed (one user waits for another on the single thread pool) and cannot be expanded in resources, which in turn also limits speed.
Finding Statistics
If you do not know how many users you may have, you can use Alexa to look at sites similar to yours and see the average users and page views they get per month. From that you can estimate the likely bandwidth.
Don't buy before you need it
Start with your hardware prediction, but do not go and rent two computers from day one. Start with the first, take your measurements, see how the data grows, and expand only when you need to.
Car or Formula One?
When the program runs, if you follow it you can find many, many things that need correction. I will mention just two from my own experience.
After we put the program online, our customer started adding data. After some months we noticed the database had grown much more than we expected from the data being entered. We spent almost a week finding out why and fixing it; it was a design error that made some statistics data grow exponentially. We corrected it and moved on.
After two years of running, we noticed we were making too many unnecessary calls to the SQL server. We traced it down and found another design error, corrected it, and moved on.
Actually we find and fix many small performance points every month. For me it is like Formula One: you decide which car you have, a Formula One car that needs constant tuning to get the maximum out of it, or a simple car that only needs a yearly service?
Customer Point of View
Then, the customer says "I need to support 1000 users." Well, the customer does not know programming and is trying to find a measure, from his point of view, with which to compare proposals. Actually there are many more factors here, and "1000 users" is not a meaningful parameter on its own. Is it 1000 users per day, per minute, or per month? Do they need live chat, or to see large amounts of data, or to work fast? So maybe it is up to you to sell your program correctly to the customer, by explaining that a good program is equally good for one user or for one million users, and that the cost comes mostly from the development, not from the number of users.
Now, if this is a question about actually planning a site, then the simple answer is to start doing it, and the rest will reveal itself. If this is a question because you are looking for answers for your client, then ask yourself: why does a Formula One car seat only one while your car fits five? Or how much does a movie cost? Or: we all know how to write, but why haven't all of us written and published a book? My point is that the cost actually comes from the time you spend making the project, and the user count by itself cannot determine that.
Guess, Knowledge, or Prediction?
How do you make a guess about the number of servers, bandwidth, storage, etc.? We actually do not guess. We have many sites, we automatically collect many statistics every day, we have many years of experience, and from the content of a site we know how many users it can have per day and how much bandwidth it can eat. We also have many databases running on our servers and can see how much data they use. For 99% of our sites all of these are low numbers, so this is knowledge and experience, backed by real live statistics. The prediction comes from monitoring traffic and usage: we try to make the sites better, to get more traffic and more users, and from what we achieve we try to predict whether they will need more resources in the future. Also, 99% of the sites are single-pool, running very simple presentations.
* From the book
Often this is very difficult, since the system is not even designed when the customer asks for an answer to this question, which makes it actually impossible.
As a very rough rule of thumb we use 100 requests per second per server. The actual number will vary depending on the application and how the users use the system, but we have found it a good first estimate.
The disk usage for a document system is just number of documents times average size. Bandwidth is number of requests times average size of requests.
You just document all of your assumptions and say that the hardware requirements are based on those assumptions.
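A quick sketch of those rules of thumb in use. The 100 requests/second/server figure comes from the answer above; the workload numbers are invented examples:

```python
import math

# Sizing from the rules of thumb above; workload numbers are assumed.
RPS_PER_SERVER = 100  # the answer's rough first estimate per server

peak_rps = 450
servers = math.ceil(peak_rps / RPS_PER_SERVER)

documents = 2_000_000
avg_doc_kb = 150
disk_gb = documents * avg_doc_kb / 1024**2  # documents x average size

requests_per_day = 1_000_000
avg_response_kb = 40
bandwidth_gb_per_day = requests_per_day * avg_response_kb / 1024**2

print(servers, round(disk_gb, 1), round(bandwidth_gb_per_day, 1))
# → 5 286.1 38.1
```

Each assumption (peak rps, document count, average sizes) goes in the document alongside the result, as the answer suggests.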
While developing a recent Asp.Net MVC site, I used selenium to load test my site.
Basically you record a selection of macros, in which you perform random tasks.
Then use selenium to simulate a number of users performing those macros.
I tested my site with tens, hundreds and then thousands of users.
This allows you to find trouble spots in code and in infrastructure before going live.
which figures of reference do you use?
There is really only one figure that needs to be looked at, and then extrapolated on: data. All figures will derive from data requirements.
Small example: A million requests per hour for an 8-byte binary number will not crash anything and could be served from the simplest of web servers, because the time per request will be a fraction of a millisecond. There are 1000 (ms/s) * 60 (s/m) * 60 (m/h) * 24 (h/d) = 86.4 million milliseconds in one day, so even if each request took a full millisecond, the 24 million milliseconds of work per day would fit comfortably, and the bandwidth required for the 8-byte payloads would be in the low KB/s range.
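A quick sanity check of this kind of arithmetic, using assumed figures (a million requests per hour, 1 ms of service time per request, 8-byte payloads):

```python
# How much wall-clock time and bandwidth do N tiny requests actually need?
# All inputs are assumed example values.

payload_bytes = 8
requests_per_hour = 1_000_000
ms_per_request = 1  # assumed single-threaded service time

ms_per_day = 1000 * 60 * 60 * 24  # 86,400,000 ms in one day
busy_ms = requests_per_hour * 24 * ms_per_request
utilization = busy_ms / ms_per_day  # fraction of the day spent serving

bandwidth_bytes_per_s = payload_bytes * requests_per_hour / 3600

print(f"{ms_per_day:,} ms in a day")      # 86,400,000
print(f"utilization: {utilization:.0%}")  # 28%
print(f"~{bandwidth_bytes_per_s/1024:.1f} KB/s")
```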
Real life version: Looking at the data will determine the requirements, and the data being retrieved is almost always in a database. The design of the database (even if only conceptual) can help determine how much data will be in use.
The maximum capacity of the database, or filesystem, should be examined first. This capacity can be calculated by looking at how much space each row of a table requires, by summing the space consumed by each column (e.g. an id of type int with length 6 will take 6 bytes of space). After summing the columns of one row for each table in the database, it is easy to tell how much space each collection of tables will require (usually tables are linked through foreign keys).
After table storage consumption is considered, the users must be examined. Mainly of interest is how many tables each user will access per session (with no data this will be a guesstimate; it is best to overestimate). Because we already know, or have a good idea of, the size of the database tables, we can estimate how much server memory each user will require. Comparing this memory usage to the number of expected users helps determine which server to use, or how many.
Next, figure out how many rows will be inserted into the database as a result of user actions (again an average guesstimate, or based on collected test data). This is very speculative and is best done with testing; without testing, assumptions should be overestimated. Based on how many rows each user inserts, it is possible to extrapolate the database size and the bandwidth requirements. These are determined by expanding the data requirement of one user to the requirements of n users per t time.
The data required by n users will make it possible to see bandwidth requirements over t time, and will also determine how n users will grow the database over t time.
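The row-size reasoning above can be sketched as a small calculation. The table layouts, column sizes, and per-session behavior here are all assumed examples:

```python
# Extrapolating database growth from row sizes; all values are assumptions.

# Bytes per row, from summing the declared column sizes.
row_bytes = {
    "users":  4 + 64 + 64 + 8,  # id, name, email, created_at
    "orders": 4 + 4 + 8 + 8,    # id, user_id, total, created_at
}

rows_inserted_per_session = {"orders": 3}  # assumed user behavior
sessions_per_user_per_day = 2

def growth_bytes_per_day(n_users: int) -> int:
    """Bytes added to the database per day by n_users active users."""
    per_session = sum(row_bytes[t] * n
                      for t, n in rows_inserted_per_session.items())
    return per_session * sessions_per_user_per_day * n_users

print(growth_bytes_per_day(10_000))  # → 1440000 bytes/day for 10k users
```

Scaling `n_users` and the time window gives both the database growth and the write bandwidth over t time, as described above.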
In practice, we don't. We make sure we are able to rapidly expand (devops), have a possibility to fall back to using less resources/request, start with a very small number of users and observe performance. Most small-medium projects don't want to spend much time and money on this. For a large or critical project it makes sense to create and run simulations.
Remember, one day of planning costs as much as an extra machine for a year.
You use capacity to cover a number of non-functional qualities of a system, and you are probably trying to encapsulate performance, capacity, and scalability into one concept.
Let's start with performance. If you are dealing with a web-based architecture where you are serving resources, then this is really quite straightforward and can be split into 2 different KPIs: server response time and page load time (which should really be called resource load time, since not all resources on the web are web pages).
Server response time measures the time to last byte for a request on a given resource. Please note that this is not inclusive of things such as content negotiation. You (or the business) need to specify the expected server response time for given types of resources. This is based on a single request/response, e.g. a response to a request for any resource that falls under the type 'Car Model' should take no more than 0.5 seconds, time to last byte.
Page load times take things one step further: given a request for a resource, how long does it take to load that resource along with any dependent resources? It has the most meaning in the context of a web page. The web being full of unknowns makes this a bit of a grey area, since all sorts of things come into play (the network, the client, content negotiation), so you need to specify this given a fixed/stabilised network and client (there are all sorts of tools to achieve this). It should also always be defined as an average, without introducing concurrency issues (we are still not thinking about capacity yet).
Once you have specified both, you can start to determine the immediate capacity of your system, i.e. how many requests per second for resources can be served performantly (as specified above). There are loads of tools to help you define this. This gives you an immediate measure of capacity. You'll notice I use the term immediate because often the business might turn around and say: great, but what happens if we need to increase this capacity?
So we move on to the third non-functional quality, scalability (n.b. there are more than 3 non-functional qualities of a system, including availability, reliability, validity, usability, accessibility, extensibility, and manageability). Given a certain capacity, by how much can I increase it performantly? There are all sorts of ways to increase capacity, but most systems by design usually have a bottleneck somewhere that creates a constraint.
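As a sketch, here is how measured response times might be turned into those KPIs and an immediate capacity figure. The samples, target, and concurrency are invented, and the capacity number is a crude Little's-law-style approximation:

```python
# Turning measured response times into KPI checks; inputs are assumed.

def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(p / 100 * len(s)))
    return s[idx]

response_times_s = [0.12, 0.18, 0.22, 0.31, 0.09, 0.45, 0.27, 0.19, 0.61, 0.15]
TARGET_S = 0.5    # e.g. the 0.5 s time-to-last-byte target above
CONCURRENCY = 50  # assumed concurrent workers

p95 = percentile(response_times_s, 95)
avg = sum(response_times_s) / len(response_times_s)
capacity_rps = CONCURRENCY / avg  # rough throughput at that concurrency

print(f"p95 = {p95:.2f}s, meets target: {p95 <= TARGET_S}")
print(f"estimated capacity: {capacity_rps:.0f} req/s")
```

Here the p95 misses the target even though the average looks fine, which is exactly why the KPI should be specified before measuring capacity.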

design considerations for a WCF service to be accessed 500k times/day

I've been tasked with creating a WCF service that will query a db and return a collection of composite types. Not a complex task in itself, but the service is going to be accessed by several web sites which in total average maybe 500,000 views a day.
Are there any special considerations I need to take into account when designing this?
No special problems for the development side.
Well designed WCF services can serve thousands of requests per second. Here's a benchmark for WCF showing 22,000 requests per second, using a blade system with 4x HP ProLiant BL460c Blades, each with a single, quad-core Xeon E5450 CPU. I haven't looked at the complexity or size of the messages being sent, but it sure seems that on a mainstream server from HP, you're going to be able to get 1000 messages per second or more. And with good design, scale-out will just work. At that peak rate, 500k per day is not particularly stressful for the communications layer built on WCF.
At the message volume you are working with, you do have to consider operational aspects.
Most system ops people I have spoken with who oversee WCF systems (and other .NET systems) use an approach where, in the morning, they want to look at the basic vital signs of the system:
moving averages of request volume: 1min, 1hr, 1day.
comparison of those quantities with historical averages
error/exception rate: 1min, 1hr, 1day
comparison of those quantities with historical averages
If your exceptions are low enough in volume (in most cases they should be), you may wish to log every one of them into a special application event log, or some other audit log. This requires some thought - planning for storage of the audits and so on. The reason it's tricky is that in some cases, highly exceptional conditions can lead to very high volume logging, which exacerbates the exceptional conditions - a snowball effect. Definitely want some throttling on the exception logging to avoid this. a "pop off valve" if you know what I mean.
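A minimal sketch of such a "pop-off valve": a token bucket that caps the exception-log rate so an error storm cannot snowball. The rate and burst values are assumptions:

```python
import time

# Token-bucket throttle for exception logging; rates are assumed examples.
class ThrottledLogger:
    def __init__(self, max_per_sec: float, burst: int):
        self.rate = max_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()
        self.dropped = 0  # count of suppressed log entries

    def log(self, message: str) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed, up to the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            print(message)  # stand-in for the real audit/event log
            return True
        self.dropped += 1   # suppress, but remember how many we dropped
        return False

logger = ThrottledLogger(max_per_sec=10, burst=5)
results = [logger.log(f"exception {i}") for i in range(100)]
print(f"logged {sum(results)}, dropped {logger.dropped}")
```

Recording the dropped count preserves the error volume signal for the morning vital-signs check even while the individual entries are suppressed.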
Data store
And of course you need to ensure that the data source, whatever it is, can support the volume of queries you are throwing at it. Just as a matter of good citizenship, you may want to implement caching on the service to relieve load on the data store.
With the benchmark I cited, the network was a pretty wide open gigabit ethernet. In your environment, the network may be shared, and you'll have to check that the additional load is reasonable.

How do you 'spec' a web server to support a given application?

I am having real trouble trying to get decent answers to my questions from VPS and dedicated hosting provider's sales people. I have a fairly simple set of requirements - how do I select a server spec / package and be confident that I have got it about right? Are there metrics to use - number of Http requests per minute for example? are there other benchmarks? How do you approach it?
Initial requirements are:
must support a private ASP.NET application supporting upwards of 200 users (possibly up to 1000). User activity will be largely continuous throughout the working day
the application is more intensive than your average 'website' (but not unduly so)
users will be uploading and downloading large files
requires a MS SQL Server database - (would workgroup edition suffice?)
must support another 5 public domains with low traffic levels and little to no database activity
Follow Up:
Thanks for the responses guys. I have got access to a system I can configure for profiling, so can anyone recommend any profiling / load testing tools?
You could add all the data you want here about number of users, etc., and you could even make it more quantitative than you have here - it's an impossible question to answer in this way.
The real answer is: profile, profile, profile. You must measure how the application and its database behave to make any kind of determination about the resources needed to support N users at a certain level of activity. If you really know or have reason to believe that your software will have non-trivial load right out of the gate, then my best advice to you is to look into load testing tools and services.
Your requirements are too vague to make a decision. For example, what is a large file? What is a low traffic level?
One of the easiest ways to get it right is to just setup a test server + network, and try and simulate a normal load.
If you can't setup a test environment, you are probably just stuck with guessing about the correct size, and then testing on the real system, and adjusting your service level as needed.
You need to profile the load you are expecting. Then, you should choose a provider that can meet your anticipated demand and provide a path for growth. It's a lot easier growing if your colo provider can handle the growth gracefully.
That is, if your project is the biggest one your provider has ever done or requires more sophisticated hardware than they are used to, you need to look for someone bigger.
When you know what you want, be very specific, but not overly aggressive with your requirements. For example, if you need 8gb of memory, say 8gb. Don't say 4gb required but 16gb would be nice.
I'd also request quotes on two systems: what you need today and what you might need in a year.
You're on the right track with http requests per second. Also look at disk io and memory usage by IIS. They need to know how much traffic their hardware can handle.
SQL Workgroup edition should work for you - it works up to 3GB of memory.

In terms of today's technology, are these meaningful concerns about data size?

We're adding extra login information to an existing database record on the order of 3.85KB per login.
There are two concerns about this:
1) Is this too much on-the-wire data added per login?
2) Is this too much extra data we're storing in the database per login?
Given todays technology, are these valid concerns?
We don't have concrete usage figures, but we average about 5,000 logins per month. We hope to scale to larger customers; however, that would still be tens of thousands of logins per month, not thousands per second.
In the US (our market) broadband has 60% market adoption.
Assuming you have ~80,000 logins per month, you would be adding ~ 3.75 GB per YEAR to your database table.
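Checking that growth arithmetic (extra kilobytes per login extrapolated to gigabytes per year):

```python
# Extra bytes per login -> GB per year, using the answer's assumed volume.

kb_per_login = 3.85
logins_per_month = 80_000  # the answer's assumed scaled-up volume

gb_per_year = kb_per_login * logins_per_month * 12 / 1e6
print(f"~{gb_per_year:.2f} GB/year")  # ~3.70, the same ballpark as above
```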
If you are using a decent RDBMS like MySQL, PostgreSQL, SQLServer, Oracle, etc... this is a laughable amount of data and traffic. After several years, you might want to start looking at archiving some of it. But by then, who knows what the application will look like?
It's always important to consider how you are going to be querying this data, so that you don't run into performance bottlenecks. Without those details, I cannot comment very usefully on that aspect.
But to answer your concern, do not be concerned. Just always keep thinking ahead.
How many users do you have? How often do they have to log in? Are they likely to be on fast connections, or damp pieces of string? Do you mean you're really adding 3.85K per time someone logs in, or per user account? How long do you have to store the data? What benefit does it give you? How does it compare with the amount of data you're already storing? (i.e. is most of your data going to be due to this new part, or will it be a drop in the ocean?)
In short - this is a very context-sensitive question :)
Given that storage and hardware are so cheap these days (relatively speaking, of course), this should not be a concern. Obviously if you need the data then you need the data! You can use replication to several locations so that the added data doesn't need to move as far over the wire (such as a server on the west coast and one on the east coast). You can manage your data by separating it by state to minimize the size of your tables (similar to what banks do: choose your state as part of the login process so they look at the right data store). You can use horizontal partitioning to minimize the number of records per table to keep your queries speedy. There are lots of ways to keep large data optimized. Also check out Lucene if you plan to do lots of reads on this data.
In terms of today's average server technology it's not a problem. In terms of your server technology it could be a problem. You need to provide more info.
In terms of storage, this is peanuts, although you want to eventually archive or throw out old data.
In terms of network traffic, this is not much on the server end, but it will affect how quickly your website appears to load and function for a good portion of customers. Although many have broadband, someone somewhere will try it on EDGE or a modem, or while using BitTorrent heavily; your site will appear slow or malfunction altogether, and you'll get loud complaints all over the web. Does it matter? If your users really need your service, they can surely wait; if you are developing the new Twitter, the page load time increase is hardly acceptable.

Asp.net guaranteed response time

Does anybody have any hints as to how to approach writing an ASP.net app that needs to have a guaranteed response time?
When under high load that would normally cause us to exceed our desired response time, we want to throw out an appropriate number of requests so that the rest of the requests can return before the max response time. Throwing out requests based on exceeding a fixed requests-per-second limit is not viable, as there are other external factors controlling response time that cause the max rps we can safely support to drift and fluctuate fairly drastically over time.
It's OK if a few requests take a little too long, but we'd like the great majority of them to meet the required response time window. We want to throw out the minimal, or near-minimal, number of requests so that we can process the rest within the allotted response time.
It should account for ASP.Net queuing time, ideally the network request time but that is less important.
We'd also love to do adaptive work, like make a db call if we have plenty of time, but do some computations if we're shorter on time.
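A minimal sketch of that deadline-aware behavior: shed a request when its budget is already spent, and pick the cheaper code path when time is short. The budget, threshold, timestamps, and the three work paths are all assumptions for illustration:

```python
import time

# Deadline-aware request handling sketch; all values are assumed examples.
BUDGET_S = 0.5           # desired max response time
CHEAP_THRESHOLD_S = 0.2  # below this much remaining budget, skip the DB

def handle(enqueued_at, now=None):
    now = time.monotonic() if now is None else now
    remaining = BUDGET_S - (now - enqueued_at)  # budget left after queuing
    if remaining <= 0:
        return "rejected"          # shed: it cannot finish in time anyway
    if remaining < CHEAP_THRESHOLD_S:
        return "computed locally"  # cheap fallback instead of the DB call
    return "answered from db"      # plenty of time: do the full work

t0 = 1000.0  # pretend monotonic timestamps, for illustration
print(handle(t0, now=t0 + 0.1))  # answered from db
print(handle(t0, now=t0 + 0.4))  # computed locally
print(handle(t0, now=t0 + 0.6))  # rejected
```

The key point is that the decision uses the remaining time budget per request, not a fixed rps cap, so it adapts automatically as external factors slow things down.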
SLAs with a guaranteed response time require a bit of work.
First off, you need to spend a lot of time profiling your application. You want to understand exactly how it behaves under various load scenarios: light, medium, heavy, crushing. When doing this profiling step, it is critical that it's done on the exact same hardware/software configuration that production uses. Results from one set of hardware have no bearing on results from even a slightly different set of hardware. This isn't just about the servers either; I'm talking routers, switches, cable lengths, hard drives (make/model), everything, even BIOS revisions on the machines, RAID controllers, and any other device in the loop.
While profiling make sure the types of work loads represent an actual slice of what you are going to see. Obviously there are certain load mixes which will execute faster than others.
I'm not entirely sure what you mean by "throw out an appropriate number of requests". That sounds like you want to drop those requests... which sounds wrong on a number of levels. Doing this usually kills an SLA as being an "outage".
Next, you are going to have to actively monitor your servers for load. If load levels get within a certain percentage of your max then you need to add more hardware to increase capacity.
Another thing, monitoring result times internally is only part of it. You'll need to monitor them from various external locations as well depending on where your clients are.
And that's just about your application. There are other forces at work such as your connection to the Internet. You will need multiple providers with active failover in case one goes down... Or, if possible, go with a solid cloud provider.
Yes, in the last mvcConf one of the speakers compares the performance of various view engines for ASP.NET MVC. I think it was Steven Smith's presentation that did the comparison, but I'm not 100% sure.
You have to keep in mind, however, that ASP.NET will really only play a very minor role in the performance of your app; the DB is likely to be your biggest bottleneck.
Hope the video helps.