Look at the line marked with //***

var ce2 = Query.All<CalendarEvent>().Prefetch(x => x.DbState).First(); //***
            var state2 = ce.DbState;

    [HierarchyRoot, Serializable]
    [TableMapping("States")]
    [KeyGenerator(KeyGeneratorKind.None)]
    public class DbState : Entity
    {
        [Field, Key(0)]
        public string ID { get; private set; }

        [Field]
        public string UrlName { get; set; }
    }

    [HierarchyRoot, Serializable]
    [TableMapping("CalendarEvents")]
    public class CalendarEvent : Entity
    {
        [Field, Key]
        public int ID { get; private set; }
        [Field(Nullable = true)]
        [FieldMapping("StateID")]
        public DbState DbState { get; set; }
    }

This will send two batches: 1. get all calendarevents 2 get all needed states

problems: 1. it will fetch all calendar events because it looses the .First() 2. it will to a sencond batch to get the state. However it is important to do this in one query with a join because i have a materialized view covering this join. I know of the pattern .Select({ce, ce.DbState}) but the question is: why doesn't the prefetch api do this? It would be kind of a problem to me to migrate all places in my application that are reling on prefetch to the Select solution.

This thread was imported from our support forum. The original discussion may contain more detailed answer.

asked Aug 02 '10 at 15:46

tobi's gravatar image

tobi
13556

edited Sep 03 '10 at 16:16

Alex%20Yakunin's gravatar image

Alex Yakunin
29714412


One Answer:
  1. Note that prefetch API is API to IEnumerable, not IQueryable. It acts efficiently, but requires sequences, not queries. Why? I'll explain this later here.

So to fix the issue #1, you should use this code:

var ce2 = Query.All<CalendarEvent>().Take(1).Prefetch(x => x.DbState).First();

Btw, I'm not sure if .Take will work in this case: likely, you must put .OrderBy before it, since by default result is unordered. I don't remember if we "inject" implicit ordering by PK in such cases, or simply throw an exception on such attempts, so you should check this. But e.g. trying to do the same in SQL (usage of top N w/o order by clause) should lead to failure.


  1. Prefetch API really executes additional batches during enumeration of original sequence. We use this implementation because of the following assumptions:

a) Such approach allows to use much more fine-granted control over prefetch process. E.g.:

  • If necessary data is already available, it won't be queried. Will be quite important, when we'll add global cache.

  • We support limits on prefetched count of items for collections by this way - again, very important, if there are large collections

  • We enable you to conditionally prefetch the data - in fact, all you need is to filter it out during enumeration

b) Regular way of prefetching with JOINs has a set of significant disadvantages:

  • Cartesian product issue: result set size grows rapidly with each prefetch. In fact, it is almost useless, if prefetch depth is 2 or greater. There is no such issue in our case.

  • Costly query plan: it's well-known fact that JOIN order optimization (look for this in Google) is hard. Only heuristic algorithms work here, if number of joins is high, and thus the result of this optimization isn't well-predictable. So if in our case .Prefetch leads to predictable expenses (btw, in our case there are simple lookups), you can get completely opposite result in frameworks doing the same with JOIN.

  • In our case necessary data is continuously prefetched during iteration - i.e. we don't prefetch everything right on enumeration. Imagine what happens in case with JOIN.

c) Finally, you can prefetch a reference with .Select in DO. The same works for collections as well, but in this case we actually send additional queries during iteration (i.e. this works nearly as in case with regular .Prefetch). Possibly, later we'll use JOIN also for .Select for collection property, but definitely just for the first one (i.e. there will be 1 or 2 additional JOINs at max, dependently if used EntitySet<t> is mapped on intermediate Entity type or not).

d) Getting executed an additional batch won't significantly affect on performance:

  • As it was mentioned, there is fully predictable cost.

  • Internally we use a low-level analogue of future queries for this (query translation results are cached as well), so we can run nearly 5-10K of such batches per second. Compare this to 1000K LINQ queries per second in NHibernate (see http://ormbattle.net/ ; it worth to mention we launched this web site, but it is supported by commercial ORM vendors participating there now).

  • Finally, see http://ormbattle.net/index.php/blog/105 ... rnate.html - there is some truth about performance of queries with "left join fetch" in NHibernate. Materialization performance in NH is far from ideal, but it seems in this case Cartesian product simply slashes it at all, so finally is performs nearly with the same speed as EF without any prefetch (I can't believe Fabio didn't know there is .Include method in EF, but he didn't use it in that test).

So short answer is: don't fear .Prefetch. If you'll discover any issue with its performance further, just notify us.


I'd like to finish with practical example. Here is a quote from chat with one of our customers:

[20:33:41] CB: i am still in shock [20:33:47] CB: however it is what i expected [20:33:56] CB: so all i can say is GOOD JOB GUYS!!!

var owner = Query.All<Owner>()
    .Where(o => o.UserID == userId)
    .Prefetch(o => o.Beneficiary)
    .Prefetch(o => o.Document, documents => documents
        .Prefetch(d => d.Lookup))
    .Prefetch(o => o.Product)
    .Prefetch(o => o.Property)
    .Prefetch(o => o.Event)
    .Prefetch(o => o.SecurityQuestion)
    .Prefetch(o => o.Lookup)
    .Prefetch(o => o.Image, images => images
        .Prefetch(i => i.Property)
        .Prefetch(i => i.Area)
        .Prefetch(i => i.Asset))
    .Prefetch(o => o.Property, properties => properties
        .Prefetch(p => p.Image)
        .Prefetch(p => p.PrimaryBeneficiary)
        .Prefetch(p => p.PropertType)
        .Prefetch(p => p.GarageType)
        .Prefetch(p => p.PlumbingType)
        .Prefetch(p => p.RoofType)
        .PrefetchSingle(p => p.PropertType, lookup => lookup.Prefetch(l => l.LookupType))
        .PrefetchSingle(p => p.GarageType, lookup => lookup.Prefetch(l => l.LookupType))
        .PrefetchSingle(p => p.RoofType, lookup => lookup.Prefetch(l => l.LookupType))
        .PrefetchSingle(p => p.PlumbingType, lookup => lookup.Prefetch(l => l.LookupType))
    ).ToList();

This query with .Prefetch works 50...100 times faster in DO4 in comparison to EF. Or, more precisely, in our case it simply works as expected, but in case with EF it doesn't.

Note that nothing complex is actually written here: the original query is Query.All<owner>().Where(o => o.UserID == userId), everything else are prefetch expressions.


Just noticed one more thing: you said Query.All<calendarevent>().Prefetch(x => x.DbState).First() enumerates all CalendarEvents internally. That's not true: try running this on SQL Server with MARS turned on and with varying number of CalendarEvents in the database (e.g. 100K of such items). I suspect you won't notice any changes in performance at all, since internal enumerator of .Prefetch will iterate Query.All<calendarevent>() enumerator (relying on MARS) for some constant number of items only (AFAIK, 128...256 - the number of items it prefetches at once). And since the original query doesn't require any computation (it's just PK enumeration), the actual cost of this query is proportional to the size of this set, i.e. it is constant :)

But certainly, version with .Take(...) is faster.

answered Aug 02 '10 at 19:46

Alex%20Yakunin's gravatar image

Alex Yakunin
29714412

Your answer
Please start posting your answer anonymously - your answer will be saved within the current session and published after you log in or create a new account. Please try to give a substantial answer, for discussions, please use comments and please do remember to vote (after you log in)!
toggle preview

powered by OSQA