Sau khi tham khảo nhiều nguồn trên internet và thông qua các lần thảo luận với đồng nghiệp, Tôi có test thử các kỹ thuật để test tối ưu hóa trong các câu lệnh trong LinQ.
Đây là link gốc trên trang Sidar OK
(Thanks Sidar OK site)
Các bạn sẽ sử dụng Petshop để Test lại.
http://www.mediafire.com/?19pyp63h5gzc49x, các bạn load về và Attach PetShop.mdf, chú ý là Tôi sử dụng SQL Server 2008
Ở đây Tôi sao chép lại 10 tips trên trang Sidar Ok. Tôi có thay bằng PetshopDatacontext.
Tôi nghĩ cái này hay và hữu dụng nên đã sao chép nguyên bản gốc vào đây. Các bạn có thể làm lại để kiểm chứng kết quả:
=================================================================
1 – Turn off ObjectTrackingEnabled Property of Data Context If Not Necessary
If you are trying only to retrieve data as read only, and not modifying anything, you don’t need object tracking. So turn it off using it like in the example below:
using (PetshopDataContext context = new PetshopDataContext ())
{
context.ObjectTrackingEnabled = false;
}
This will allow you to turn off the unnecessary identity management of the objects – hence Data Context will not have to store them because it will be sure that there will be no change statements to generate.
2 – Do NOT Dump All Your DB Objects into One Single DataContext
DataContext represents a single unit of work, not all your database. If you have several database objects that are not connected, or they are not used at all (log tables, objects used by batch processes,etc..). These objects just unnecessarily consume space in the memory hence increasing the identity management and object tracking costs in CUD engine of the DataContext.
Instead think of separating your workspace into several DataContexts where each one represents a single unit of work associated with it. You can still configure them to use the same connection via its constructors to not to loose the benefit of connection pooling.
3 – Use CompiledQuery Wherever Needed
When creating and executing your query, there are several steps for generating the appropriate SQL from the expression, just to name some important of them:
- Create expression tree
- Convert it to SQL
- Run the query
- Retrieve the data
- Convert it to the objects
As you may notice, when you are using the same query over and over, hence first and second steps are just wasting time. This is where this tiny class in System.Data.Linq namespace achieves a lot. With CompiledQuery, you compile your query once and store it somewhere for later usage. This is achieved by static CompiledQuery.Compile method.
Below is a Code Snippet for an example usage:
Func<PetshopDataContext , IEnumerable<Category>> func =
CompiledQuery.Compile<PetshopDataContext , IEnumerable<Category>>
((PetshopDataContext context) => context.Categories.
Where<Category>(cat => cat.Products.Count > 5));
And now, “func” is my compiled query. It will only be compiled once when it is first run. We can now store it in a static utility class as follows :
/// <summary>
/// Utility class to store compiled queries
/// </summary>
public static class QueriesUtility
{
/// <summary>
/// Gets the query that returns categories with more than five products.
/// </summary>
/// <value>The query containing categories with more than five products.</value>
public static Func<PetshopDataContext , IEnumerable<Category>>
GetCategoriesWithMoreThanFiveProducts
{
get
{
Func<PetshopDataContext , IEnumerable<Category>> func =
CompiledQuery.Compile<PetshopDataContext , IEnumerable<Category>>
((PetshopDataContext context) => context.Categories.
Where<Category>(cat => cat.Products.Count > 5));
return func;
}
}
}
And we can use this compiled query (since it is now a nothing but a strongly typed function for us) very easily as follows:
using (PetshopDataContext context = new PetshopDataContext ())
{
QueriesUtility.GetCategoriesWithMoreThanFiveProducts(context);
}
Storing and using it in this way also reduces the cost of doing a virtual call that’s done each time you access the collection – actually it is decreased to 1 call. If you don’t call the query don’t worry about compilation too, since it will be compiled whenever the query is first executed.
4 – Filter Data Down to What You Need Using DataLoadOptions.AssociateWith
When we retrieve data with Load or LoadWith we are assuming that we want to retrieve all the associated data those are bound with the primary key (and object id). But in most cases we likely need additional filtering to this. Here is where DataLoadOptions.AssociateWith generic method comes very handy. This method takes the criteria to load the data as a parameter and applies it to the query – so you get only the data that you need.
The following code below associates and retrieves the categories only with continuing products:
using (PetshopDataContext context = new PetshopDataContext ())
{
DataLoadOptions options = new DataLoadOptions();
options.AssociateWith<Category>(cat=> cat.Products.Where<Product>(prod => !prod.Discontinued));
context.LoadOptions = options;
}
5 – Turn Optimistic Concurrency Off Unless You Need It
LINQ to SQL comes with out of the box Optimistic Concurrency support with SQL timestamp columns which are mapped to Binary type. You can turn this feature on and off in both mapping file and attributes for the properties. If your application can afford running on “last update wins” basis, then doing an extra update check is just a waste.
UpdateCheck.Never is used to turn optimistic concurrency off in LINQ to SQL.
Here is an example of turning optimistic concurrency off implemented as attribute level mapping:
[Column(Storage=“_Description”, DbType=“NText”,
UpdateCheck=UpdateCheck.Never)]
public string Description
{
get
{
return this._Description;
}
set
{
if ((this._Description != value))
{
this.OnDescriptionChanging(value);
this.SendPropertyChanging();
this._Description = value;
this.SendPropertyChanged(“Description”);
this.OnDescriptionChanged();
}
}
}
6 – Constantly Monitor Queries Generated by the DataContext and Analyze the Data You Retrieve
As your query is generated on the fly, there is this possibility that you may not be aware of additional columns or extra data that is retrieved behind the scenes. Use Data Context’s Log property to be able to see what SQL are being run by the Data Context. An example is as follows:
using (PetshopDataContext context = new PetshopDataContext ())
{
context.Log = Console.Out;
}
Using this snippet while debugging you can see the generated SQL statements in the Output Window in Visual Studio and spot performance leaks by analyzing them. Don’t forget to comment that line out for production systems as it may create a bit of an overhead. (Wouldn’t it be great if this was configurable in the config file?)
To see your DLINQ expressions in a SQL statement manner one can use SQL Query Visualizer which needs to be installed separately from Visual Studio 2008.
7 – Avoid Unnecessary Attaches to Tables in the Context
Since Object Tracking is a great mechanism, nothing comes for free. When you Attach an object to your context, you mean that this object was disconnected for a while and now you now want to get it back in the game. DataContext then marks it as an object that potentially will change – and this is just fine when you really intent to do that.
But there might be some circumstances that aren’t very obvious, and may lead you to attach objects that arent changed. One of such cases is doing an AttachAll for collections and not checking if the object is changed or not. For a better performance, you should check that if you are attaching ONLY the objects in the collection those are changed.
I will provide a sample code for this soon.
8 – Be Careful of Entity Identity Management Overhead
During working with a non-read only context, the objects are still being tracked – so be aware that non intuitive scenarios this can cause while you proceed. Consider the following DLINQ code:
using (PetshopDataContext context = new PetshopDataContext ())
{
var a = from c in context.Categories
select c;
}
Very plain, basic DLINQ isn’t it? That’s true; there doesn’t seem any bad thing in the above code. Now let’s see the code below:
using (PetshopDataContext context = new PetshopDataContext ())
{
var a = from c in context.Categories
select new Category
{
CategoryID = c.CategoryID,
CategoryName = c.CategoryName,
Description = c.Description
};
}
The intuition is to expect that the second query will work slower than the first one, which is WRONG. It is actually much faster than the first one.
The reason for this is in the first query, for each row the objects need to be stored, since there is a possibility that you still can change them. But in the 2nd one, you are throwing that object away and creating a new one, which is more efficient.
9 – Retrieve Only the Number of Records You Need
When you are binding to a data grid, and doing paging – consider the easy to use methods that LINQ to SQL provides. These are mainly Takeand Skip methods. The code snippet involves a method which retrieves enough products for a ListView with paging enabled:
/// <summary>
/// Gets the products page by page.
/// </summary>
/// <param name=”startingPageIndex”>Index of the starting page.</param>
/// <param name=”pageSize”>Size of the page.</param>
/// <returns>The list of products in the specified page</returns>
private IList<Product> GetProducts(int startingPageIndex, int pageSize)
{
using (PetshopDataContext context = new PetshopDataContext ())
{
return context.Products
.Take<Product>(pageSize)
.Skip<Product>(startingPageIndex * pageSize)
.ToList<Product>();
}
}
10 – Don’t Misuse CompiledQuery
=================================================================