This question is about the use of lookup data in WPF dialogs. For a dialog with person entities these are, for example, values for salutations, nationalities, etc., or, if addresses are assigned, countries, regions, and so on.

So that the initial delay of the dialog is as short as possible, all database activity during initialization should take no more than 0.1 to 0.2 seconds.

For quick access to these rarely changing values, a DisconnectedState is already filled with them at program start:

Execute( ge => ... ).Run(); 
...

In the actual MDI WPF dialog, the cached values are merged into the dialog's own session (ClientProfile) via Session.DisconnectedState.Merge(App.DSPersonDlgCache). Unfortunately, this step alone already takes about 0.3 seconds.

Now the lookup data are provided via

List1 = Session.Query.All<T1>()...ToList(); 
List2 = Session.Query.All<T2>()...ToList(); 
... 
Listn = Session.Query.All<Tn>()...ToList();


Control1.ItemsSource = List1; 
Control2.ItemsSource = List2; 
... 
Controln.ItemsSource = Listn;

to the different controls.



1.

Although the data is effectively already cached in the DisconnectedState, each query still generates an unwanted DB access.

E.g. "SELECT [a].[Id], 110 AS [TypeId], [a].[Code2], [a].[Code3], [a].[CodeInt], [a].[PhoneCode], [a].[CarCode] FROM [dbo].[Country] [a];"

How can the DB access be avoided?

MS SQL Express - 0.2 seconds   
MS Azure       - 1.2 seconds!!!

2.

Using Session.Query.ExecuteDelayed( qe => ... ) indeed needs only one DB access in total, but unfortunately it cannot be combined with .Prefetch(...) (an error message is raised).

MS SQL Express - 0.1 seconds   
MS Azure       - 0.7 seconds!!!

3.

When using the query List1 = Session.DisconnectedState.All<T1>()...ToList(); there is no DB access anymore, but the performance is very poor!

MS SQL Express - 5.4 seconds!!!   
MS Azure       - 5.4 seconds!!!

If the lookup data are read in a separate transaction, execution is, strangely, much faster, but still too slow:

using( Session.OpenTransaction() ) 
{ 
  List1 = Session.DisconnectedState.All<T1>()...ToList(); 
  List2 = Session.DisconnectedState.All<T2>()...ToList(); 
  ... 
  Listn = Session.DisconnectedState.All<Tn>()...ToList(); 
}


MS SQL Express - 1.7 seconds!!!   
MS Azure       - 1.7 seconds!!!

What further options are there in DO 4.4 to increase the speed (O2O mapping would only be a last resort)? The Caching API planned for DO 4.5 would be a great help.


4. Session.SaveChanges()

After modifying only one person, for example LastName "Maier" -> "Meier", SaveChanges() also re-reads all lookup data from the DB, which is unwanted.

MS SQL Express - 35 seconds!!!   
MS Azure       - 70 seconds!!!

Right now we use a workaround with Operations.Replay(...). How can the re-reading be prevented?


Kind Regards

asked Feb 08 '11 at 05:01

TeaMan

edited Feb 08 '11 at 05:23


3 Answers:

Your tips have helped a lot.

The best results so far were:

MS SQL Express - 0.1 seconds   
MS Azure       - 0.7 seconds!!!

Now:

MS SQL Express - 0.015 seconds!!!   
MS Azure       - 0.015 seconds!!!

Thank you!

answered Feb 14 '11 at 06:09

TeaMan

Thank you for your detailed answer! This sheds some light on the matter.

On 4:

  • With these hints, the problem is solved!

On 3:

  • In the example there are only 600 cached 'main' lookup entities, but 19,000 in total including the assigned localization entities.
  • Perhaps some optimizations are possible in DO here?

So what we really need are session-independent cached query results (without O2O mapping).

In our program the user can start several independent person MDI WPF dialogs. Each has its own Session and DisconnectedState.

But the lookup entities are always the same! To avoid wasting time, the data should be cached globally.

The same applies, for instance, to globally cached program configuration data based on DO entities.

At the moment, simply caching lookup entities in global lists does not work because of the Session binding (exception: "Entity 'Country,(52)' is bound to another Session").

Kind Regards

answered Feb 09 '11 at 13:36

TeaMan

edited Feb 09 '11 at 13:40

On 3: since DS.All processes all the entities there, you are dealing with 19K entities. How many times do you enumerate them? The provided example builds N lists, so what is the actual value of N?

Btw, the easiest way to speed this up is to enumerate the entities once, split them by type first, and then use this data to build all the lists:

(Feb 09 '11 at 14:13) Alex Yakunin

Example code:

var entitiesByType = (
  from e in ds.All()
  group e by e.GetType() into g
  select new { k = g.Key, v = g.ToList() }
  ).ToDictionary(p => p.k, p => p.v);

var list1 = (
  from p in entitiesByType
  where typeof(T1).IsAssignableFrom(p.Key)
  from e in p.Value
  select e
  ).ToList();
(Feb 09 '11 at 14:13) Alex Yakunin

So the DS.All<T> method currently does not implement an optimization like this - mainly because we never need such a map in any of the DS operations.

(Feb 09 '11 at 14:15) Alex Yakunin

Perhaps here are some optimizations of DO possible?

I'd like to know the actual N to make a decision. Numbers like ~100K entity fetch operations per second are acceptable on the client side, but much smaller ones would mean we must really profile this.

Note that internally DS operates like a cache between the Session and the DB, so DO tries to fetch the state of each accessed entity from this cache in each new outermost transaction that runs while the DS is attached.

(Feb 09 '11 at 14:19) Alex Yakunin

At the moment easy caching lookup entities to global lists does not work because of Session binding (Exception: "Enitiy 'Country,(52)' is bound to another Session").

Have you understood the idea I just described - keeping a single DS with shared static data and using DisconnectedState.Merge to merge this data into any other DS you need? If this is unclear, I'm ready to help.

(Feb 09 '11 at 14:23) Alex Yakunin

Short example:

// At app. start:
StaticContent = BuildStaticContent();

// At some refresh method:
// 1. drop old client-side session 
//    (i.e. the one with attached DS)
// 2. create new client-side session

session.DisconnectedState.Merge(
  StaticContent.DisconnectedState);

var genders = 
  StaticContent.GenderKeys.Select( 
    key => session.Query.Single<Gender>(key));

// This can be extracted to 
// StaticContent.GetGenders(session) method
...
(Feb 09 '11 at 14:31) Alex Yakunin
  • n = 9
  • Where did you describe the idea with StaticContent in detail?
(Feb 10 '11 at 03:28) TeaMan

n = 9

If so, I'd consider there are no strong requirements for any optimization: DO populates about 100K entities per second in your case, which is nearly the same speed you might expect from the DB (the peak is ~300K entities/second, but that's for small entities). So DS operation speed isn't the issue.

(Feb 13 '11 at 22:54) Alex Yakunin

Where did you describe the idea with StaticContent in detail?

Unfortunately, it isn't described in our manual or blog. But the idea is pretty simple: since DS has a Merge method, you can populate a DS with static content (plus collections storing keys or Ref<T> objects describing the content of your dictionaries, since DS caches just entities, not query results), and use this DS to populate the content of any other DS you're going to use in the UI.

(Feb 13 '11 at 22:54) Alex Yakunin

On 1:

DisconnectedState does not cache query results; it caches fetched entity states. So any query (except the Session.Query.Single method group and prefetches) definitely hits the database.

To avoid this, you must either cache query results on your own (e.g. by caching the lists), or use LINQ to Enumerable over the DisconnectedState content. See the DisconnectedState.All() method (and similar ones).
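For example, both lookup lists below produce the same result, but only the first one touches the database (a sketch based on the Country entity from the SQL shown in the question; the ordering clause is illustrative):

// Hits the DB - Session.Query always translates to SQL:
var countries = Session.Query.All<Country>()
    .OrderBy(c => c.Code2)
    .ToList();

// LINQ to Enumerable over the DisconnectedState content - no DB access:
var cachedCountries = Session.DisconnectedState.All<Country>()
    .OrderBy(c => c.Code2)
    .ToList();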


On 2:

.Prefetch in DO works differently than you might think: this API is based on IEnumerable<T>, not IQueryable<T>. Basically, DO first determines what is actually absent from the result, and then fetches it in an efficient way. More precisely, DO wraps the enumerable passed to .Prefetch in another one that subsequently prefetches all the missing parts (i.e. .Prefetch continuously reads the data during the first enumeration, but at its start, so it's safe to pass a large enumerable to it).

It's also important that prefetch optimizes the sequence of queries it plans to run by relying on delayed queries - i.e. in most cases there will be just one additional batch.

Such an approach has its own pros and cons compared to the standard one (with JOINs):

Pros:

  • The original query isn't modified, so the query plan is more predictable in this case; moreover, since the number of joins significantly affects the optimizer's decisions, quite likely it will be more efficient because of this. The tail of [this post] proves this.
  • This approach doesn't lead to an "explosion" of the number of rows if you prefetch a set of relationships.
  • Everything that is already fetched isn't repeatedly transmitted from the DB server. So if we imagine there is a lot of RAM and DO already had a second-level cache (actually, it doesn't have one yet), most prefetch requests wouldn't hit the DB at all.

Cons:

  • The original query must be executed in a separate batch before prefetch can start working (since it needs live data to queue its own queries).

We use a similar strategy for TPT inheritance as well - i.e. try to load less first without torturing the DB, and then load everything else using just index seeks.
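So, in code, a prefetch combined with a regular query looks roughly like this (a sketch; the Regions association on Country is a hypothetical example):

// The base query runs unmodified in its own batch; .Prefetch then
// determines which associated Regions are still missing and fetches
// only those, typically in a single additional delayed-query batch.
var countries = Session.Query.All<Country>()
    .Prefetch(c => c.Regions)  // hypothetical association
    .ToList();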


On 3:

It all depends on the number of entities you cache in the DS. If the timings you show are produced on a few thousand entities, that's not OK. But if this result was produced on e.g. 100K entities, I'd consider it fully acceptable.

Also note that DisconnectedState is designed to operate mainly on the client, so we didn't seriously fight for its raw performance on large numbers of entities (the working set there should be limited to 10-100K entities at most). There are optimizations related to specific cases - e.g. it maintains a back-reference map to support fast entity deletion with reference cleanup; commits and rollbacks also require ~constant time there. On the other hand, we never seriously optimized it for e.g. flat enumeration - for the above reason. In short, we tried to eliminate any timings worse than O(N*log(N)) for any simple operation.

What further options are there in DO 4.4 (O2O mapping would only be the last resort) to increase the speed?

Could you describe your specific case? Mainly, how many entities do you enumerate this way?

Using transactions is also a necessary performance optimization here (alternatively, you could turn off auto transactions in the session options).

The planned Caching API in DO 4.5 would be a great help.

Actually, it won't: the L2 cache will operate on the Domain level; its main purpose is to eliminate unnecessary roundtrips on fetches and prefetches. It will cache query results as well, but again, this feature kicks in only where you'd otherwise hit the DB - and in this case you don't hit the DB.


On 4:

True, that's the default behavior. DO does this to refresh the versions of the entities to the actual ones. There are several ways to affect this behavior:

  • Remove VersionsUsageOptions.Update flag from VersionsUsageOptions.
  • Specify your own VersionUpdateFilter predicate (you can make a decision based on key type there).
  • Finally, you can use your own VersionsProvider.
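For the lookup-only scenario described in the question, the second option could look like this (a sketch; ds stands for the dialog's DisconnectedState, and the blanket key => false filter is the variant suggested in the comments on this answer):

// Prevent SaveChanges() from re-reading versions (and thus all the
// lookup data) from the DB; this mechanism refreshes only versions,
// not the entity data itself.
ds.VersionUpdateFilter = key => false;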

answered Feb 09 '11 at 08:04

Alex Yakunin

On 4: Just to make sure I understand the scope of the version update that occurs after an update: is only the Version updated in the DisconnectedState entities from the DB? Or all of the entity's properties as well? Or all of the properties when a difference is found in the Version? I ask because some business logic on the server side might modify some properties, and if only the Version gets updated, then the entities in the DisconnectedState may become partially out of sync...

(Feb 09 '11 at 10:36) pcournoyer

On 4:

Does it make sense to have another option where only the entities modified by the update are refreshed afterwards? This would seem to cover most cases at minimum performance cost.

(Feb 09 '11 at 10:39) pcournoyer

On 4:

1) Yes, only the versions are updated on saving, not the data itself. If you need to update entity states, you should do this manually.

2) VersionsUsageOptions.Update is also used on fetches, etc., so blocking this option in general isn't desirable; setting VersionUpdateFilter to key => false is a much better way here. Likely, we should use this filter by default (taking 1 into account).

(Feb 09 '11 at 14:02) Alex Yakunin

Currently it's recommended to completely refresh, by queries, the whole part that might be affected after an update (i.e. to have a special routine for this). Dictionaries and any other "static" content can be cached in a separate DS and merged into the current one before or after such a refresh.

(Feb 09 '11 at 14:02) Alex Yakunin