The Rain and The Shade

December 23, 2012

Let us role a dice. (More predictable random numbers)

Filed under: C# Coding — ovaisakhter @ 11:24 am

If you are making web applications chances are that you have heard of requirements like

  • Every 5th user should see this survey
  • The advertisements should be shown randomly such that 20% visitors should see Advertisement 1 and 30% Advertisement 2 and so on.

When we are trying to solve this type of problem naturally type of solutions come to our mind are something on the lines of Create a central data store and keep that track of visitors and who was shown what and then create an algorithm to add some randomness in the behavior (well this what came to my mind within the seconds of hearing this problem).

When we closely look at this solution the first problem we can identity is it will slow down the website. Every time a page is shown you have to access a data store (may be a session) in case of multiple web servers a database  , if the site has considerably large page views then this can be a problem, as this will be a sequential read/write operation hence there will be locks and requests will be waiting for their turn. Despite of all this trouble there is a huge chance that you will still not be able to fulfill the requirements one hundred percent. At this point in time if we ask the business stake holders, they will tell the requirement was never for 100% accuracy anyway (yea they are more reasonable people, far more than we give them credit for). So we have established that 100% accuracy is not required, and the traditional solutions can seriously hurt the performance.

Now let us talk about something else i.e. probability. If you take a coin and toss it, there is a 50% probability for which side the coin will land on. If we keep on flipping the coin for lets say 100s of times roughly it will land 50% of times on each side. In the same way there is a 1/6 chance for each side of dice to appear while rolling a dice. Let us try to  use this knowledge to solve the above mentioned problem. Let us imagine that we can create a dice having different number of sides based on the requirements and are able to increase or decrease the chances of for each side to appear. Each time we get a request we can roll the imaginary dice and based on the appearance of the side we take a decision. This way each time we have to take a decision we will not have look inside data store we just roll a dice and respond, and with our knowledge of probability we can safely say that we will be roughly in the reasonably I acceptable range. If we have more than one servers we can put a dice on each server and we are good to go.

There can be many ways to incorporate the probability into your code one way of this is to using Random. Let us take an example. When a visitor comes to our site we have to shown her a string. 10% of the users should see a string “10%”, 30% of the users should see string “30%” and 60% of the users should see a string “60%”. When the user comes to  site we generate a number between 1 to 100. If this number is from 1 to 10 you show the “10%” string to the user, if the number is between 11 to 40 we return “30%” string and from 41 to 100 we return the “60%” string. The idea is that there is a 10% probability for a random number from 1 to 100 to be land between 1 to 10 and so on and so forth.

I wrote a class that takes a list of percentages and their identification variables(string in our case), and when asked for the next random returns the appropriate identification variable.

var choices = new List<WeightedRandomEntry<string>>
                                                            {
                                                                new WeightedRandomEntry<string>(10,"10%"),
                                                                new WeightedRandomEntry<string>(30,"30%"),
                                                                new WeightedRandomEntry<string>(60,"60%")
                                                               

                                                            };

            var generator = new WeightedRandomGenerator<string>(choices);

now I ask for the Next random for “TotalIterations” number of times and count each time a string was returned and in the end print the percentage of appearance of each string. and then do the whole operation again for 100 times. 

for (var j = 0; j < 100; j++)
            {
                var ten = 0;
                var thirty = 0;
                var sixty = 0;
                for (var i = 0; i < TotalIterations; i++)
                {
                    switch (generator.GetNext())
                    {
                        case "10%":
                            ten++;
                            break;
                        case "30%":
                            thirty++;
                            break;
                        case "60%":
                            sixty++;
                            break;
                    }
                }

                Console.WriteLine("{0}                        {1}                    {2}",Percent(ten),Percent(thirty),Percent(sixty));
               
            }

The snapshot shows the result of the above code

  

This snapshot show the result of this code you can see that the results are roughly within the expected range. There is less that 1% deviation from the requirements.

Now if we want to solve the every 5th user problem we can have an identifier “show survey”  with 20% and “Don’t show survey” with 80%, get the Next of each page view and show survey if “show survey” is retuned.  This approach can be used to solve most of the weighted random problems without using any data store and provide considerable performance.

You can download the code used in this example from the following link.

Advertisements

Blog at WordPress.com.