The Rain and The Shade

December 23, 2012

Let us roll a dice. (More predictable random numbers)

Filed under: C# Coding — ovaisakhter @ 11:24 am

If you are making web applications chances are that you have heard of requirements like

  • Every 5th user should see this survey
  • The advertisements should be shown randomly such that 20% visitors should see Advertisement 1 and 30% Advertisement 2 and so on.

When we try to solve this type of problem, the solutions that naturally come to mind are along the lines of: create a central data store, keep track of visitors and who was shown what, and then add an algorithm to introduce some randomness into the behavior (well, this is what came to my mind within seconds of hearing the problem).

When we look closely at this solution, the first problem we can identify is that it will slow down the website. Every time a page is shown you have to access a data store (maybe a session, or in the case of multiple web servers a database). If the site has a considerably large number of page views this becomes a problem, because the reads and writes are sequential: there will be locks, and requests will be waiting for their turn. Despite all this trouble, there is a good chance you will still not be able to fulfill the requirement one hundred percent. If at this point we ask the business stakeholders, they will tell us the requirement was never for 100% accuracy anyway (they are more reasonable people, far more than we give them credit for). So we have established that 100% accuracy is not required, and that the traditional solutions can seriously hurt performance.

Now let us talk about something else: probability. If you take a coin and toss it, there is a 50% probability for each side it can land on. If we keep flipping the coin, say hundreds of times, it will land roughly 50% of the time on each side. In the same way there is a 1/6 chance for each side of a dice to appear when we roll it. Let us try to use this knowledge to solve the problem above. Imagine that we can create a dice with a different number of sides based on the requirements, and that we can increase or decrease the chance of each side appearing. Each time we get a request we roll this imaginary dice and take a decision based on the side that comes up. This way, each time we have to take a decision we do not have to look inside a data store; we just roll the dice and respond, and with our knowledge of probability we can safely say that we will stay roughly within an acceptable range. If we have more than one server we can put a dice on each server and we are good to go.

There are many ways to incorporate probability into your code; one of them is to use Random. Let us take an example. When a visitor comes to our site we have to show her a string: 10% of the users should see the string “10%”, 30% of the users should see the string “30%” and 60% of the users should see the string “60%”. When the user comes to the site we generate a number between 1 and 100. If this number is from 1 to 10 we show the “10%” string, if it is from 11 to 40 we return the “30%” string, and from 41 to 100 we return the “60%” string. The idea is that a random number from 1 to 100 has a 10% probability of landing between 1 and 10, and so on and so forth.
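To make the idea concrete before introducing the class, here is a minimal sketch of that mapping using Random (just an illustration, not the code from the download link):

var random = new Random();
var roll = random.Next(1, 101);   // a number from 1 to 100 inclusive

string message;
if (roll <= 10)
    message = "10%";              // 1 to 10: 10 of the 100 values, so ~10% of visitors
else if (roll <= 40)
    message = "30%";              // 11 to 40: 30 values, so ~30% of visitors
else
    message = "60%";              // 41 to 100: 60 values, so ~60% of visitors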

I wrote a class that takes a list of percentages and their identification values (strings in our case), and when asked for the next random value returns the appropriate identification value.

var choices = new List<WeightedRandomEntry<string>>
{
    new WeightedRandomEntry<string>(10, "10%"),
    new WeightedRandomEntry<string>(30, "30%"),
    new WeightedRandomEntry<string>(60, "60%")
};

var generator = new WeightedRandomGenerator<string>(choices);

Now I ask for the next random value “TotalIterations” times, count how many times each string was returned, and at the end print the percentage of appearances of each string. Then I repeat the whole operation 100 times.

for (var j = 0; j < 100; j++)
{
    var ten = 0;
    var thirty = 0;
    var sixty = 0;

    for (var i = 0; i < TotalIterations; i++)
    {
        switch (generator.GetNext())
        {
            case "10%":
                ten++;
                break;
            case "30%":
                thirty++;
                break;
            case "60%":
                sixty++;
                break;
        }
    }

    Console.WriteLine("{0}                        {1}                    {2}", Percent(ten), Percent(thirty), Percent(sixty));
}
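The TotalIterations constant and the Percent helper are not shown in the snippet above; something along these lines would complete it (the value of 10,000 is my assumption, not taken from the original code):

private const int TotalIterations = 10000;

private static double Percent(int count)
{
    // The share of iterations that produced this string, as a percentage.
    return count * 100.0 / TotalIterations;
}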

The snapshot below shows the result of the above code.

[Screenshot: the printed percentages for the 10%, 30% and 60% strings over 100 runs]

You can see that the results are roughly within the expected range; there is less than 1% deviation from the requirements.

Now if we want to solve the every-5th-user problem, we can have an identifier “show survey” with 20% and “don’t show survey” with 80%, call GetNext on each page view, and show the survey if “show survey” is returned. This approach can be used to solve most weighted-random problems without using any data store, and it performs considerably better.
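For the survey case, a rough sketch of how the same generator could be used (the identifier strings here are illustrative, not taken from the original code):

var surveyChoices = new List<WeightedRandomEntry<string>>
{
    new WeightedRandomEntry<string>(20, "show survey"),
    new WeightedRandomEntry<string>(80, "don't show survey")
};
var surveyGenerator = new WeightedRandomGenerator<string>(surveyChoices);

// On each page view:
var showSurvey = surveyGenerator.GetNext() == "show survey";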

You can download the code used in this example from the following link.


March 17, 2012

Visual Studio Add-in for code review

Filed under: C# Coding,Code Quality — ovaisakhter @ 9:16 am

Background

We use static code analysis on our code, and Resharper as well (I am not sure which category to place it in, but it is surely the best tool next to Visual Studio that a developer can have). Although these tools have made the life of the architects a lot easier, manual code reviews (if I may call them that) are still very relevant.

A while back I had to do a code review for a decent-sized project. For someone as lazy as I am, it is a struggle to write down the review reports. It is a pain to note the context information like project, file, line number and some selected text for each defect report. I Googled for a tool which could make my life easier, but most of the tools I found were a bit too complex for my needs, and on top of that most of them were coupled to a particular source control system (not to mention they were not free either).

The Add-In

Eventually I thought of writing a very simple version myself. Here are the requirements for this simplest version:

  • should be able to show it in visual studio
  • should be able to write a description of the issue
  • should capture the following context information (a rough sketch of how this can be gathered follows this list)
    • Solution Name
    • Project Name
    • File Name
    • Line Number
    • Selected Text
  • should be able to append these reports to a text file
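Before going further, here is a rough sketch of how the context requirement can be met using the Visual Studio automation model (EnvDTE). The property names are the standard DTE ones, but the actual code in the add-in may well differ:

using EnvDTE;
using EnvDTE80;

// Sketch only: given the DTE2 instance the add-in receives when it is loaded,
// gather the context fields listed above.
private static string BuildContext(DTE2 dte)
{
    var document = dte.ActiveDocument;
    var selection = (TextSelection)document.Selection;

    return string.Format(
        "Solution: {0}\r\nProject: {1}\r\nFile: {2}\r\nLine: {3}\r\n\r\nSelected Text: {4}",
        dte.Solution.FullName,
        document.ProjectItem.ContainingProject.Name,
        document.Name,
        selection.TopPoint.Line,
        selection.Text);
}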

In this post I will try to introduce you to the add-in and its capabilities (which are not many) and show how to install it. I will explain a bit more about the code in my next post; I will, however, put a link to the source code at the end if you want to look at it.

Once you install the plug-in you get a “Code Review” item in your Tools menu. When this item is clicked, a Visual Studio tool window is loaded with the code review form. You can dock this window in Visual Studio for easy access. Here is how it looks after docking.

[Screenshot: the Code Review tool window docked in Visual Studio]

So the plug-in offers some very basic capabilities. Once loaded, you can select a text file where all of your review will be saved. You can go to a file, maybe select some text, and start writing an issue description in the window, and poof: the above-mentioned context information appears in the context box. Once done with the description you can click the “Append to file” button, and the description along with the context information will be saved to the text file.

[Screenshot: the review form with the issue description and the captured context]

When you click “Append to file”, here is what is saved to the file:

Code Review Defect Report
**************************
Description
***********
this code generated by visual studio really stinks, well not really I am just doing to create a fake code defect report
Context
*******
Solution: C:\Users\Ovais\documents\visual studio 2010\Projects\Wisdom.VisualStudio.Tools\Wisdom.VisualStudio.Tools.sln
Project: Wisdom.VisualStudio.TestApplication
File: Program.cs
Line: 19

Selected Text: Application.EnableVisualStyles();
Application.SetCompatibleTextRenderingDefault(false);
Application.Run(new Form1());

 

Installation

I have not made any fancy installation setup for this, so you will have to do the following steps manually:

  1. Copy the contents of install.zip to a folder on your computer, e.g. c:\Wisdomplugins
  2. In Visual Studio, go to the Tools menu and select Options
  3. In the Options dialog, select Environment/Add-in/Macros Security
  4. Click the Add button and provide the path of your newly created folder
  5. Restart Visual Studio and, with a stroke of luck, you will see the menu item in the Tools menu

I have only tested this on the English version of Visual Studio, and I am pretty sure it will not work on any other language :)

You can get the code for the Add-in from the following location.

http://ge.tt/4X0Mm7F?c

As I mentioned I will explain the code of the Add-in in a later post. So if you are interested stay tuned.

Happy reviewing!

February 8, 2012

Peeping into very large text (XML) files

Filed under: C# Coding,PowerShell — ovaisakhter @ 4:47 pm

Recently I worked with a 40 GB XML file. The main objective was to look at the data in the file, create a compatible data model, and eventually write a routine which could import the data into a database.

To start with, I used a virtual machine hosted somewhere in the cloud for all my work. This approach made my life much easier because:

  • The long-running processes were not slowing down my own computer, and I could do something else while the processing was running
  • I could just turn off my PC and go home without having to kill the process
  • I got a much faster machine in the VM

I tried to open the file in some well-known editors, but with no result: most editors froze. My initial objective was just to look into some part of the document to at least get an idea of what type of data I had to work with. My first thought was to write a small program using XmlReader in C# and dump some part of the file into another file.
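That program never got written, for the reason that follows, but for reference a minimal sketch of the XmlReader approach could look like this (the file names and the count of 100 are made up for illustration):

using System.Xml;

// Copy the first 100 direct children of the root element of a huge XML file into a
// small sample file, streaming so the 40 GB file is never loaded into memory.
using (var reader = XmlReader.Create("huge.xml"))
using (var writer = XmlWriter.Create("sample.xml", new XmlWriterSettings { Indent = true }))
{
    reader.MoveToContent();                      // position on the root element
    writer.WriteStartElement(reader.LocalName);  // re-create the root in the sample

    var copied = 0;
    while (copied < 100 && reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Depth == 1)
        {
            using (var subtree = reader.ReadSubtree())
            {
                subtree.MoveToContent();
                writer.WriteNode(subtree, true); // copy this child element whole
            }
            copied++;
        }
    }

    writer.WriteEndElement();
}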

Just before writing the code I stumbled upon the Get-Content command in Windows PowerShell. Get-Content lets you open a text-based file and perform basic operations on it. So if you want to show a file, you write something like

Get-Content .\1.txt

at the Windows PowerShell prompt. This will print the entire content of the file to the console. Not very useful, is it?

Now if you want to display the first 10 lines of this file, you can use the following command

Get-Content .\1.txt -totalcount 10

The best thing about this command is that it will not load the whole file into memory; it will only read the specified lines. You can easily save the output of this command to a file like this:

Get-Content .\1.txt -totalcount 10 > new1.txt

You can get more information about this command from the following link

http://technet.microsoft.com/en-us/library/ee176843.aspx

The link describes some other parameters of Get-Content that let you count the number of rows or read the last n rows, but you cannot use these when working with very large files: all of them load the whole file into memory before performing the requested operation, which defeats the purpose of using the command in the first place.
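If you do need a row count or the last few lines of a file that size, a streaming approach in C# avoids the memory problem. A rough sketch (the file name is illustrative):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Count the lines without loading the file into memory:
// File.ReadLines streams the file one line at a time.
var lineCount = File.ReadLines(@"C:\data\1.txt").LongCount();
Console.WriteLine("Lines: {0}", lineCount);

// Last 10 lines, still streaming: keep only a sliding window of 10 lines.
var window = new Queue<string>();
foreach (var line in File.ReadLines(@"C:\data\1.txt"))
{
    if (window.Count == 10)
    {
        window.Dequeue();
    }
    window.Enqueue(line);
}

foreach (var line in window)
{
    Console.WriteLine(line);
}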

August 17, 2011

Unit Testing Database driven functionality with ADO.Net Entity Framework Code First

Filed under: ADO.Net Entity Framework,Unit Testing — ovaisakhter @ 5:39 am

The basic principle of writing a unit test is that when you perform an operation you know the “expected” result, you compare it with the actual result that was generated, and if the two match you have a passing unit test.

It is always interesting to write unit tests for database services. The usual problem is how to predict the exact state of the database at the time the unit tests are running, so that the tests know what to expect. Some people may argue that you can mock out all the database functionality (mock the repositories). That idea may work well in some scenarios, but I usually do not go this way, for two reasons:

  • It seems like too much work
  • Our applications are mostly data-driven, so there is not much to test other than the database transactions anyway.

One way of knowing the exact state of the database is to initialize (create) the database every time the unit tests are run. Once the database is created, we might want to insert some initial data required by the unit tests.

I have been using Entity Framework Code First in my recent projects. Entity Framework provides a very good way of achieving the two objectives mentioned in the previous paragraph.

EF provides the concept of database initializers: implementations that define the strategy for how (or rather when) the database will be created from our model (if you are a bit lost, I recommend watching Code First Development with Entity Framework). These implementations implement an interface called IDatabaseInitializer. Some pre-baked strategies for database creation are provided with EF; the one we are interested in here is DropCreateDatabaseAlways. The name says it all: when using this strategy, the database is created every time the code is executed (i.e. every time the unit tests are run). To use the initializer, all you need to do is write the following code before your unit tests are executed:

System.Data.Entity.Database.SetInitializer<MyDbContext>(new DropCreateDatabaseAlways<MyDbContext>());

Make sure you are using a local database server (SQL Express); things can get really stressful if you are using a shared development server :)

So now our database is created every time the unit tests are run; next we need to create some data each time the database is created. For this you can inherit from the DropCreateDatabaseAlways class and override a method called Seed. In this method you add the initial records you want created every time the database is created.

Here is what the code looks like:

public class MyUnitTestDatabaseInitializer : DropCreateDatabaseAlways<MyDbContext>
{
    protected override void Seed(MyDbContext context)
    {
        // Add any data you need using the context object provided in the parameter.

        context.SaveChanges();
        base.Seed(context);
    }
}
The Seed method is called by the framework just after the database is created. Now all you need to do is tell the framework to use your database initializer, which you do by modifying the code I wrote above:

System.Data.Entity.Database.SetInitializer<MyDbContext>(new MyUnitTestDatabaseInitializer());

So there you have it: now you can write unit tests that are totally predictable and can run in any environment, be it your local machine or your build server.
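To make the wiring concrete, here is a minimal sketch of how this might sit in an MSTest project. The test framework, the Customers set and the expected count are my assumptions, not taken from the original project:

using System.Data.Entity;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CustomerRepositoryTests
{
    [ClassInitialize]
    public static void SetUpDatabase(TestContext testContext)
    {
        // Register the initializer once, before any test touches the context.
        // The database is then dropped, recreated and seeded on first use.
        Database.SetInitializer(new MyUnitTestDatabaseInitializer());
    }

    [TestMethod]
    public void GetAll_ReturnsOnlyTheSeededCustomers()
    {
        using (var context = new MyDbContext())
        {
            // The expected count is whatever the Seed method inserted.
            Assert.AreEqual(2, context.Customers.Count());
        }
    }
}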

August 11, 2011

My name is Ovais and ICodeInCSharp, IDoPhotography, IWriteBlogs

Filed under: C# Coding,General Software Archiecture,Modeling — ovaisakhter @ 2:11 pm

 

I am a big fan of interfaces in an object-oriented language like C#. I have been using dependency injection (IoC) in most of my previous projects, and with DI your model contains many interfaces. I always try to spend some time naming my model elements, be it interfaces, classes, methods or even variables. When I look at code by other people, one thing I usually notice is how they name their model elements.

Some days back I was looking into NServiceBus. One of the things I noticed while doing that, other than its sheer brilliance, was the way different interfaces were named, for example IHandleMessages or IContainSagaData. This is a different way of naming than what I usually use; normally I would have something like IMessageHandler. I found it an interesting approach: it is as if a class is saying “my name is MessageHandler and I handle messages”. It sounds more natural, more understandable.

I thought I should give this way of naming a try, so in one of my projects I used similar naming.

The first place I needed to create interfaces was the repositories. So I started with names like IStoreData (instead of IRepository, which sounds boring, doesn’t it?), then IStoreDataInSql, and went on making names like IStoreStatistics.

The next place was the application controller, where the interfaces for the services would normally be named IStatisticsService or IConfigurationService etc. The names now became a bit different: ILogStatistics, IManageSiteSurveryConfiguration, INotifyToExecuteAction, INotifyWhenSiteSurveyConfigurationChanges.

Though it was a fun way of naming interfaces, in some cases it became as stale as the old approach, with names like IProvideThis and IHandleThat, but in other cases the names became very interesting, like INotifyWhenSiteSurveyConfigurationChanges and INotifyToExecuteAction.

August 10, 2011

JIT implementation for PivotViewer featuring Flickr.Net

Filed under: Silverlight — ovaisakhter @ 2:01 pm

I saw my first demo of PivotViewer, given by a friend of mine, some months back. I was very impressed at the time, but then I forgot all about it. Then I saw it again some days back and decided to explore it a bit. As a developer I was not interested in the Excel tools provided by Microsoft. Having an interest in photography, the first thought that came to my mind was that a Flickr implementation of the PivotViewer could be very exciting. I went and tried googling a Flickr implementation for PivotViewer so that I could download the code, run it and call it a day. But to my surprise I was not able to find one. So I thought I would try to make one myself.

So the question was where to start. I knew that there is a very nice library available for accessing the Flickr service; it can be found here:

http://flickrnet.codeplex.com/

The remaining part was how to provide the information to PivotViewer at run time. After a little searching I found this information at the following location:

http://www.silverlight.net/learn/data-networking/pivot-viewer/just-in-time-sample-code

The sample code comes with prebuilt providers. I really loved the one for Twitter: it lets you visualize your own Twitter feed in a way you have never imagined, and that was the point at which I became even more interested in PivotViewer.

[Screenshot: the Twitter collection rendered in PivotViewer]

You can filter your Twitter feed using the @mentions in your tweets and the #hashtags you have used. It gives you a full overview of your thought pattern over a period of time in a very user-friendly way.

When you open the solution in Visual Studio you notice that there are four projects in it:

PivotServer: This is the web application used to host the Pivot JIT implementation

PivotServerTools: As the name suggests, it contains all the important data objects and tools used in the implementation

SilverlightPivotViewer: The Silverlight project used to host the Pivot.

CollectionFactories: This is the project we are most concerned with; it holds the implementations for the different data sources. This is where we will add our very own Flickr implementation.

The Flickr Factory

To start the implementation of a JIT provider, the first thing you need to do is a bit of thinking: what are the different dimensions along which you would like to view your information? These are the dimensions you will be able to filter and sort your items by. Flickr provides a concept of a “Set”, where each photo can belong to a set, so I chose the sets as one of the dimensions. The other two dimensions I chose are Date Uploaded and Date Taken. I would have loved to use some of the EXIF information as dimensions, but there was no way to get all the pictures and their EXIF information in one call.

The next thing you will need is an API key from Flickr; you can get it from http://www.flickr.com/services/ after logging in.

So now the planning is done. We are ready to make our provider.

First, add a class to the CollectionFactories project and inherit it from CollectionFactoryBase:

public class FlickrCollection : CollectionFactoryBase

public FlickrCollection()
{
    this.Name = "FlickrPhotos";

    this.SampleQueries = new string[]
    {
        "flickrUserEmail=ovaisbutt@yahoo.com",
        "flickrUserEmail=daredeagle@yahoo.com",
        "flickrUserEmail=ssatif@yahoo.com"
    };
}

In the constructor you set the name of your collection and some sample queries that your implementation can handle. This information is used by the framework to generate the UI, as we will see later on.

To provide the collection information you need to override a method of this class, i.e. MakeCollection.

Now let us get some data from Flickr

First we need to get a Flickr user by the email provided.

FoundUser foundUser;
var flickr = new Flickr(ApiKey);

try
{
    foundUser = flickr.PeopleFindByEmail(emailofUser);
}
catch (Exception exception)
{
    return ErrorCollection.FromException(exception);
}

flickrCollection.Name = string.Format("photos of {0}", foundUser.FullName);

Then we get all the sets of the user:

var allsets = flickr.PhotosetsGetList(foundUser.UserId);

Now we can iterate through all the sets, get the picture information in them, and start creating the collection items:

var flickrCollection = new Collection();

foreach (var set in allsets)
{
    var allPhotosInSet = flickr.PhotosetsGetPhotos(set.PhotosetId, PhotoSearchExtras.DateTaken | PhotoSearchExtras.DateUploaded);

    foreach (var photo in allPhotosInSet)
    {
        // allPhotos (declared earlier, not shown) tracks photo ids already added,
        // so a photo appearing in more than one set is only added once.
        if (allPhotos.Contains(photo.PhotoId))
        {
            continue;
        }

        allPhotos.Add(photo.PhotoId);

        var facetsForFlickr = new List<Facet>
        {
            new Facet("Set", set.Title),
            new Facet("Title", photo.Title),
            new Facet("Photo Description", photo.Description),
            new Facet("WebLink", new FacetHyperlink("Link", photo.WebUrl)),
            new Facet("Date Taken", photo.DateTaken),
            new Facet("Date Uploaded", photo.DateUploaded)
        };

        var url = photo.DoesLargeExist ? photo.LargeUrl : photo.SmallUrl;

        flickrCollection.AddItem(photo.Title, null, photo.Description, new ItemImage(new Uri(url)), facetsForFlickr.ToArray());
    }
}

The most important thing in this code is the list of Facets; this is the meta information you provide about each picture. Once all the items are added to the collection, we need to tell the framework how we want our meta information to be used and displayed:

            flickrCollection.SetFacetDisplay("Set", true, true, false);
            flickrCollection.SetFacetDisplay("Date Taken", true, true, false);
            flickrCollection.SetFacetDisplay("Date Uploaded", true, true, false);
            flickrCollection.SetFacetDisplay("Title", false, true, false);
            flickrCollection.SetFacetDisplay("Photo Description", false, true, false);

 

And finally we return the collection.

Run the application and navigate to the silverlightPivotViewerTestPage.aspx page, and you will see your collection in the dropdown.

[Screenshot: the collection dropdown on the test page]

 

Select one of the options and the Pivot will be rendered for that user (if you have many pictures it may take a while).

Here are some views from the result.

[Screenshots: several views of the Flickr collection rendered in PivotViewer]

You can see this in action at www.wisdom.pk.

August 5, 2011

Injecting Timer into Services

Filed under: Dependency Injection,Modeling — ovaisakhter @ 5:03 am

Recently, while developing a project, I had two services in my ApplicationController that were using a Timer: they were instantiated with the application, a timer was started when each service was instantiated, and the services performed a certain action on the Elapsed event.

I am using dependency injection in this project, with the Unity container by Microsoft. I use constructor injection and define all dependencies in a BootStrapper class present in my application.

While thinking about refactoring, I came to the conclusion that the timer should be injected into the services from the outside as a dependency. So I started modeling a timer, considering names like ITimer, ITimerService or even IKeepTime (recently I have been inspired by Udi Dahan’s way of naming interfaces like IDoThis and IHandleThat; fascinating stuff, but more on that later). ITimer was the winner in the end.

I started contemplating: it is merely a wrapper around the normal Timer class, so it should just provide the features the Timer class provides. Then I thought: why should my service have the capability, or even the knowledge, of the interval at which the timer will elapse? It only needs to know when to start the timer, when to stop it, and when it has elapsed. The Stop feature is only relevant so that the consuming service can ask the timer to hold on while it is performing an action, if it wants to do one thing at a time (which is generally a good thing; we should tell our classes to do this as well). Some may argue that the timer does not need to know this: it should just keep banging on the door and our service should ignore any such notifications. I agree that this can be a valid way of thinking, but read on, it got cleared up for me later. Here is the final interface I came up with:

public interface ITimer
{
    void Start();
    void Stop();
    event EventHandler Elapsed;
}

Now I was all happy: I made the implementation of the service, got everything set for the next day when I would refactor my services to use my newly created model, and called it a day. But then I started thinking about it again when I came home (I hope my wife doesn’t read my blog, but this blog is not about a new Chicken Biryani recipe, so we are all set). I realized that the services utilizing the timer do not care that there is a timer running inside the timer service. To them, this is a service that notifies them when they should execute an action, and a timer is just one concrete implementation of such a service. It could just as well be code listening to some event, or even a UI sending the click of a button, in which case being able to tell the notifying service that the host service is not entertaining any notifications at the moment may make sense (maybe disabling a button during execution). So let us call our newly thought-of service INotifyToExecuteAction and change the names of the members to go with the new theme.

public interface INotifyToExecuteAction
{
    void StartNotifications();
    void StopNotifications();
    event EventHandler Execute;
}
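For reference, a concrete implementation wrapping System.Timers.Timer might look roughly like this (the class name and the interval parameter are illustrative, not the actual code from the project):

using System;
using System.Timers;

// A timer-backed implementation of INotifyToExecuteAction. The consuming service
// only sees start/stop notifications and an Execute event; the fact that a Timer
// drives it is an implementation detail.
public class TimerNotifier : INotifyToExecuteAction
{
    private readonly Timer _timer;

    public TimerNotifier(double intervalInMilliseconds)
    {
        _timer = new Timer(intervalInMilliseconds);
        _timer.Elapsed += (sender, args) => OnExecute();
    }

    public event EventHandler Execute;

    public void StartNotifications()
    {
        _timer.Start();
    }

    public void StopNotifications()
    {
        _timer.Stop();
    }

    private void OnExecute()
    {
        var handler = Execute;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

The bootstrapper can then register an implementation like this with the Unity container, so the services receive it through constructor injection without ever knowing that a timer is behind it.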

(By the way, did I tell you that the renaming feature in Resharper is freaking awesome? Ctrl+R, R on the method names of the interface renames them and changes the method names in all the implementations as well. How cool is that?)

So here we are then.

July 1, 2011

Generating Weighted Random

Filed under: C# Coding — ovaisakhter @ 2:02 pm

Have you ever been given a requirement by the business people along the lines of: we want 5% of the visitors of the site to participate in a survey? The first thought that comes to mind (at least it came to mine) is that we will need to maintain a data store that keeps a count of all the visitors, and then show the popup to every 20th visitor or something like that. When your code is deployed into a load-balanced environment this data store becomes a database, and then things begin to get hairy: you need to deal with problems like locking and concurrency, and most importantly performance. Interestingly, you will never be able to get the exact percentage without seriously hurting performance.

The situation becomes even more interesting when the requirement grows into something like: 10% should see survey 1, 20% should see survey 2, and the rest should not see any survey.

A while back, when I was facing this same problem, a friend of mine (Mads Voigt Hingelberg) suggested a great way out.

It is very simple if you think about it.

Take the example I described above. When a user comes, generate a random number from 1 to 100; if this number is from 1 to 10 show them survey 1, if it is from 11 to 30 show them survey 2, and if the number is above 30 do not show them anything. This approach may not give you 100% accurate results (which you were not going to get anyway), but if you take a large enough sample you will see the values are pretty close to what you are looking for.

I wrote a simple class to handle this problem generically.

public class WeightedRandom<T>
{
    private readonly List<Range<T>> _cases = new List<Range<T>>();
    private static readonly object LockObject = new object();
    private readonly Random _rand = new Random();

    public WeightedRandom(IEnumerable<WeightedRandomDto<T>> cases)
    {
        if (cases.Sum(x => x.Percentage) != 100)
        {
            throw new ApplicationException("The total of the percentages should be exactly 100");
        }

        // Turn the percentages into consecutive ranges on the 1..100 scale,
        // e.g. 10/30/60 becomes 1-10, 11-40 and 41-100.
        var tmpCases = cases.OrderBy(a => a.Percentage);
        var from = 1;
        foreach (var weightedRandomEntry in tmpCases)
        {
            var range = new Range<T>
            {
                Object = weightedRandomEntry.Object,
                From = from,
                To = from + weightedRandomEntry.Percentage - 1
            };
            from = range.To + 1;
            _cases.Add(range);
        }
    }

    public T GetNext()
    {
        lock (LockObject)
        {
            var random = GetRandom();
            foreach (var range in _cases)
            {
                if (range.From <= random && random <= range.To)
                {
                    return range.Object;
                }
            }
            return default(T);
        }
    }

    // Returns a random number from 1 to 100 (inclusive).
    private int GetRandom()
    {
        return _rand.Next(1, 101);
    }

    private struct Range<TP>
    {
        public TP Object { get; set; }
        public int From { get; set; }
        public int To { get; set; }
    }
}

The constructor takes a collection of WeightedRandomDto<T> objects:
 public struct WeightedRandomDto<T>
    {
        public WeightedRandomDto(int percentage, T obj)
            : this()
        {
            Percentage = percentage;
            Object = obj;
        }
 
        public int Percentage { get; private set; }
        public T Object { get; private set; }
    }

Here is how you can use it:

var list = new List<WeightedRandomDto<string>>
{
    new WeightedRandomDto<string>(10, "case 1"),
    new WeightedRandomDto<string>(30, "case 2"),
    new WeightedRandomDto<string>(30, "case 4"),
    new WeightedRandomDto<string>(30, "case 5")
};

_weightedRandom = new WeightedRandom<string>(list.ToArray());

var ret = _weightedRandom.GetNext();

Have fun and do remember to share your feedback.
