Thursday, November 29, 2012

Control Entity Framework, do not let it control you


At the company I am working for, we are currently facing serious problems with an application that (mis)uses Entity Framework (EF).
I have not yet worked with EF in my own projects (so do not take this post too seriously), but I did spend some hours investigating how it should probably be used in a "real world" application, meaning a mid-sized or even big data-centric business application, in contrast to the "hello world" style tutorials you usually see on the internet, where DbContexts live inside controller actions.
First of all, what are the main features / benefits that ship with EF:

  • Built-in mapping functionality and relationship management (foreign keys)
  • Automatic generation of CRUD SQL statements
  • No need to define SQL parameters manually (increased security due to reduced risk of SQL injections)
  • Automatic data migrations (managed via the NuGet Package Manager) can replace non-integrated data migrations
  • Data access layer validation (data annotations)
  • Concurrency handling (with timestamps)
  • Support of all major RDBMS
  • Quick (re-)generation of databases (possible use case: "quickly" creating testing databases within nightly builds)
  • Precompilation of queries (execution plans), but only from Version 5 on

In enterprise scenarios, you are usually dealing with existing databases. Here the question comes up how you would apply EF to an existing database.
There are two possibilities: if you prefer working with code instead of designer tools (my guess is that most developers do), you would use reverse engineering tools (e.g. EF Power Tools) to generate code classes and mappings for your existing database. The alternative to this code-centric way is the designer-centric way ("database first"), where you reverse engineer an .edmx model (classes and mappings are auto-generated from the .edmx).

Now, what should be taken into consideration when working with EF (most of the hints are from a TechEd 2012 session by Adam Tuliper)?
  • DbContext is not thread safe; instantiate a new one per request (best via DI)
  • Do not cache it or use a static instance.
  • Dispose the DbContext when done (DI does that automatically for you)
  • Utilize repository pattern, make EF your repository implementation
  • No EF code anywhere else than in your repository implementation (e.g. not as view models) - no references to EF from other layers than data access
  • Return fetched data with .ToArray() / .ToList(). Reason: EF uses deferred execution, and you usually want control over when a database query is actually performed (note that deferred execution outside the DbContext scope will lead to "DbContext already disposed" errors). By calling .ToArray() or .ToList() you force immediate execution.
  • Always check the SQL statements EF generates (MiniProfiler is a convenient option for this) and replace them by telling EF to use custom stored procedures in non-trivial scenarios
  • Performance was improved in version 5 (see above), but be aware that EF is still slower than "raw" ADO.NET access (SqlDataReader etc.). Consider a more lightweight ORM (e.g. Dapper) if winning some milliseconds per query is crucial for your application.
  • Keep controlling the loading process, avoid lazy loading when it is not necessary
  • EF does not have out-of-the-box support for "nolock"; you have to use transactions with READ UNCOMMITTED (or call stored procedures)
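Several of the points above (one context per request, EF isolated behind a repository, forcing immediate execution with .ToList()) can be combined into a minimal sketch; all names (ShopContext, Customer, ICustomerRepository) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}

// The interface lives in a layer that has no reference to EF.
public interface ICustomerRepository : IDisposable
{
    IList<Customer> GetActiveCustomers();
}

// The EF-based implementation is the only place that touches EF.
public class EfCustomerRepository : ICustomerRepository
{
    // One instance per request (injected via DI in practice), never static.
    private readonly ShopContext _context = new ShopContext();

    public IList<Customer> GetActiveCustomers()
    {
        // .ToList() forces immediate execution; without it, the deferred
        // query could run after the context is already disposed.
        return _context.Customers.Where(c => c.IsActive).ToList();
    }

    public void Dispose()
    {
        _context.Dispose();
    }
}
```

Callers only ever see ICustomerRepository and the Customer POCO, so no layer above data access needs a reference to EF.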
Let me know if you think other things are also important when using EF beyond "hello, world". Btw: most of the mentioned points apply not only to EF but to every ORM.

Thursday, November 22, 2012

ASP.NET Web API

ASP.NET Web API is Microsoft's platform for RESTful services that shipped with Visual Studio 2012 and .NET Framework 4.5 in autumn 2012.
I was searching the web some months earlier for Microsoft's RESTful service support and got quite confused. They took several different approaches over the last years (combined with renaming the technologies):
- WCF Web Http
- WCF Rest Starter Kit
- WCF Web Api

Their latest (and hopefully, for the next years, final) approach was to take the best out of the two worlds ASP.NET MVC and WCF Web API and give it the name "ASP.NET Web API". Note that this is not part of WCF any more; it ships with ASP.NET MVC 4 (as open source).

Elements that were taken from ASP.NET MVC  are:
- Routing
- Model Binding
- Validation
- IoC support
- Filters
- Link generation
- Testability
- VS template + scaffolding

The following was taken from WCF Web Api:
- Up-to-date programming model
- HttpClient, HttpServer
- Async support (=> performance and scalability)
- Formatting
- Content negotiation
- Service descriptions (in form of help pages)
- Self hosting

ASP.NET Web API functionality will look very familiar to developers who have worked with ASP.NET MVC in the past. All of the concepts have been reused; some were fine-tuned where service-specific functionality made it necessary.
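To illustrate that familiarity, here is a minimal controller sketch (the Product type, the sample data and the route are made up); note the convention-based routing and the content negotiation mentioned above:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Web.Http;

// Hypothetical model; any POCO works thanks to model binding and formatting.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Reached via the default route api/{controller}/{id}, i.e. /api/products.
public class ProductsController : ApiController
{
    private static readonly List<Product> Products = new List<Product>
    {
        new Product { Id = 1, Name = "Keyboard" }
    };

    // GET api/products - content negotiation picks JSON or XML
    // depending on the client's Accept header.
    public IEnumerable<Product> GetAll()
    {
        return Products;
    }

    // GET api/products/5
    public Product GetById(int id)
    {
        var product = Products.FirstOrDefault(p => p.Id == id);
        if (product == null)
            throw new HttpResponseException(HttpStatusCode.NotFound);
        return product;
    }
}
```

Anyone who has written an MVC controller will recognize the action-per-verb shape immediately.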

Here is an interesting introduction.

You should also use it for your client-side callbacks rather than only for "real" web services.
The reason is that ASP.NET Web API gives those callbacks the same benefits listed above, e.g. routing, model binding, validation, formatting and content negotiation.

Thursday, November 15, 2012

Private bytes, virtual bytes, working set

The process category in performance monitor (perfmon.exe) contains (amongst others) 3 memory usage counters:
- private bytes
- working set
- virtual bytes

A lot of explanations are available on the web - for a quick overview I created the following figure illustrating the differences:
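The same three values can also be read programmatically. A small sketch using System.Diagnostics (the comments reflect my understanding of how the Process properties map to the perfmon counters):

```csharp
using System;
using System.Diagnostics;

// Reads the three memory values for the current process; the same numbers
// can be watched in perfmon under the "Process" category.
class MemoryCounters
{
    static void Main()
    {
        using (var p = Process.GetCurrentProcess())
        {
            // Committed memory that is not shared with other processes.
            Console.WriteLine("Private bytes: {0}", p.PrivateMemorySize64);
            // Physical RAM currently used by the process.
            Console.WriteLine("Working set:   {0}", p.WorkingSet64);
            // Reserved plus committed virtual address space.
            Console.WriteLine("Virtual bytes: {0}", p.VirtualMemorySize64);
        }
    }
}
```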




Wednesday, November 14, 2012

Abstractness vs. Instability

We recently introduced NDepend static code analysis within some of our projects.
The report the tool is generating also contains a diagram called "Abstractness versus Instability" which perhaps needs some explanation.
But first, here is an example of how it could look:

 

As early as 1994, Robert C. Martin (author of "Clean Code") wrote an article about a set of metrics that can be used to measure the quality of an object-oriented design in terms of the interdependence between the subsystems of that design.

In general, dependencies should be avoided, but creating software systems completely without any dependencies is neither desirable nor useful.
What matters is the type of dependencies a component depends upon; Martin distinguishes between "bad" and "good" dependencies. Good dependencies are dependencies on "stable" components.
But what makes a component qualify as "stable"?
This can be expressed with a formula (see the article mentioned above) saying that:
a) stable components hardly depend on other components
b) many other components depend on them

The consequence of a) is that these components have no reason to change.
The consequence of b) is that there are a lot of reasons not to change these components.

Guess what "having no reason to change" and "lots of reasons not to change" mean in turn? Well, these components simply won't change very often. Therefore they are considered stable.
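Martin's article expresses these ideas with simple formulas: instability I = Ce / (Ca + Ce), abstractness A = Na / Nc, and the normalized distance from the main sequence D = |A + I - 1|. A small sketch (the sample numbers in Main are made up):

```csharp
using System;

// Metrics from Robert C. Martin's 1994 article on OO design quality.
static class DesignMetrics
{
    // ca = afferent couplings (classes outside that depend on this component)
    // ce = efferent couplings (outside classes this component depends on)
    public static double Instability(int ca, int ce)
    {
        return (double)ce / (ca + ce); // 0 = maximally stable, 1 = maximally unstable
    }

    // na = abstract classes in the component, nc = total classes
    public static double Abstractness(int na, int nc)
    {
        return (double)na / nc; // 0 = fully concrete, 1 = fully abstract
    }

    // Distance from the main sequence line A + I = 1.
    public static double DistanceFromMainSequence(double a, double i)
    {
        return Math.Abs(a + i - 1); // 0 = on the line, close to 1 = zone of pain or uselessness
    }

    static void Main()
    {
        // Many incoming dependencies, few outgoing ones => stable (I near 0)...
        double i = Instability(ca: 20, ce: 2);
        // ...but mostly concrete (A near 0) => lands in the zone of pain.
        double a = Abstractness(na: 1, nc: 10);
        Console.WriteLine(DistanceFromMainSequence(a, i));
    }
}
```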

Stable components can be imagined as the bricks of the bottom floors of a skyscraper. Imagine what has to be done if such bricks have to be replaced: all bricks on top of them have to be disassembled first, which is surely a lot of work.

The main sequence line in the diagram above shows how abstractness and instability should be balanced. A stable component would be positioned on the left. If you check the main sequence, you can see that such a component should be very abstract to be near the desirable line; on the other hand, if its degree of abstraction is low, it is positioned in an area called the "zone of pain".
Why is this called "zone of pain"?
As mentioned above, these components have a lot of clients. Combined with low abstraction this is a bad constellation, because "implementations" (i.e. low abstractions) tend to change frequently, which means that (lots of) clients also have to change quite frequently. I guess most of us have already faced components from the zone of pain ;-)

Now some words about unstable components. Unstable components are ones that depend on a considerably high number of other components.
Back to our skyscraper comparison: unstable components are the bricks on the top floors. They depend on bricks of lower floors (the stable ones).
Strive to reduce the number of components that depend on your unstable components; only then can they be changed easily, without having to perform numerous changes within referencing components.

Tuesday, November 13, 2012

Key lookups in SQL Server

Today I have been tuning a stored procedure in SQL Server 2005 where the execution plan showed a key lookup.

Let me first answer the question what a key lookup (displayed as a Clustered Index Seek before SQL Server 2005 SP2) is:

If you compare a database table to a book, then the table of contents is the clustered index and the index at the end of the book is a nonclustered index. Note that the latter does not only consist of keywords; it also references the page numbers (without them the index would be quite useless).
A nonclustered index works very similarly: it consists of the values of the chosen columns plus a reference to the entire row in the table.
A key lookup is the equivalent of flipping to the referenced page for a keyword: if an index does not contain all columns a SELECT statement needs, a second step has to be performed that fetches the entire row in order to serve the missing columns.

Usually, a key lookup within a query execution plan is a sign that the query could (or should) be optimized, at least if a considerable number of rows has to be looked up.

Getting rid of key lookups is usually easy: "covering indexes" (indexes that contain all selected columns, either as key columns or as included columns) will do the trick.
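A T-SQL sketch of the fix (table and column names are made up): the INCLUDE clause stores the remaining selected columns in the index leaf level, so the key lookup disappears from the plan.

```sql
-- This query triggers a key lookup when only LastName is indexed:
-- SELECT LastName, FirstName, City FROM dbo.Customers WHERE LastName = 'Smith';

-- Covering index: LastName is the search key, the other selected
-- columns are carried along as included (non-key) columns.
CREATE NONCLUSTERED INDEX IX_Customers_LastName_Covering
ON dbo.Customers (LastName)
INCLUDE (FirstName, City);
```

Included columns keep the index key small while still covering the query, which is usually preferable to making every selected column part of the key.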

Monday, November 12, 2012

How AppPools, worker processes and AppDomains are related

In IIS, you can group several applications into a common application pool (AppPool).
Every AppPool can be assigned one (that's the default) or multiple worker processes (the latter is called a "web garden").
A worker process (in case of IIS the name of the executable is w3wp.exe) owns a memory region that is isolated from other worker processes. Like every Windows process, it is assigned a unique ID that can be checked e.g. in Task Manager.
Each of the applications assigned to the AppPool lives in a separate application domain (AppDomain) within each worker process.
An AppDomain is a sub-region of memory isolated from other AppDomains within the same process.
It stores the Application, Cache, and in-process Session objects of the application and can be recycled, e.g. by touching web.config.
The following figure illustrates the relationships:


Friday, November 9, 2012

Hello world

My blog will contain short articles about information technology related topics; since I work as a .NET software developer, most of the posts will be somehow Microsoft related.
I am not sure whether this will save the world or even be all that useful for other readers. I think the majority of the posts will repeat what other web sites, books or other resources already contain, just summarized and in my own words.
So what is this good for? Well, I think that once you have dealt with a subject, writing something down is one of the best ways not to forget about it some weeks (or days, hours or minutes) later. As you see, the blog is actually meant for myself. But perhaps from time to time someone stumbles over one of the posts and finds the content useful...