Lazy-Loaded BLOB Property

WardBell (IdeaBlade)
Joined: 31-Mar-2009 | Location: Emeryville, CA | Posts: 338
Posted: 22-Apr-2010 at 4:45pm
NEW "TABLE SPLITTING" ANSWER
 
I just learned about a new feature in Entity Framework v.4 called "Table Splitting".
 
It isn't the lazy property we really want but it gives us a foundation for a decent simulation of lazy property ... and it works now, with DevForce 2010.
 
Btw, Danny Simmons (EF Team) says they wanted to get Lazy Property into EF 4 ... and ran out of time. At least it's on their roadmap.
 
Table Splitting
 
Apparently EF always had the ability to map two entities to the same table, "splitting" the columns/properties between the two.
 
They called it "Table Splitting", not to be confused with "Entity Splitting" ... which is when a single entity is mapped to two separate tables. "Table Splitting" and "Entity Splitting" are opposites. Just try to remember which is which :-)
 
 
In our example of "Table Splitting", we create an Employee entity (with properties mapped to every column EXCEPT Photo) and an EmployeePhoto entity (with two properties, EmployeeID and Photo, mapped appropriately).
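
A minimal sketch of what the two entity shapes might look like after splitting. These are plain stand-in classes, not generated EF or DevForce entities, and the FirstName and LastName properties are only illustrative placeholders for "every column except Photo":

```csharp
using System;

// Hypothetical shape of the trimmed entity: every Employee column except Photo.
public class Employee
{
    public int EmployeeID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // Navigation to the second entity mapped to the SAME table row.
    public EmployeePhoto EmployeePhoto { get; set; }
}

// Hypothetical shape of the split-off entity: just the key and the blob.
public class EmployeePhoto
{
    public int EmployeeID { get; set; }
    public byte[] Photo { get; set; }
}
```

Both classes carry the same key because both describe the same row; only the column lists differ.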
 
I said earlier that EF didn't support this trick. I wasn't entirely right about that. Apparently EF v.1 supported it. Unfortunately, the EF Designer did not and would refuse to validate the model. This circumstance parallels EF v.1's support for complex types; yes, they were there ... but the Designer threw a fit if you tried to use them.
 
You can read all about "Table Splitting" ... and how to do it ... from Gil Fink here:

http://blogs.microsoft.co.il/blogs/gilf/archive/2009/10/13/table-splitting-in-entity-framework.aspx

 
Here's some example DevForce code to illustrate the point:
 

  // Requires System.Linq for Take, Any, and First
  void GetEmployeesWithLazyPhoto() {

      var em = new NorthwindIBEntities();

      // Get first three employees ... without their photos
      var emps = em.Employees.Take(3).ToList();

      // Prove that we did not get the EmployeePhoto entities at the same time
      var photos = em.EmployeePhotos.With(QueryStrategy.CacheOnly).ToList();
      System.Diagnostics.Debug.Assert(!photos.Any()); // there aren't any photos

      // Get the photo of the first employee; navigating to it loads it on demand
      var firstPhoto = emps.First().EmployeePhoto;
      System.Diagnostics.Debug.Assert(
         !firstPhoto.EntityAspect.IsNullEntity); // now we have it
  }

 
---
Notes:
 
(1) Yes, you can use INCLUDE to get the photos eagerly with the Employees:
 
   var emps= em.Employees.Include("EmployeePhoto").Take(3).ToList();
 
(2) Employee must be the "principal" entity and EmployeePhoto must be the "dependent" entity to ensure that a new Employee is always inserted before EmployeePhoto. You'll get that arrangement if you follow Mr. Fink's recipe.
 
I've tested the query. I haven't tested adding new Employees or EmployeePhotos. This note is more of a prediction than a prescription. I'm not sure what happens if you try to add a new EmployeePhoto before adding its Employee.
 
(3) If you want to be able to create new Employee objects, the Photo column in the Employee table must either be nullable or have a database default value. Entity Framework has to be able to insert a new Employee table row without specifying a value for the Photo column. Nothing new here ... just good old SQL.
 
(4) The entity referential integrity check should take care of deleting the EmployeePhoto when you delete the parent Employee. Haven't confirmed yet.  Of course deleting the Employee effectively deletes the EmployeePhoto regardless ... because deleting the table row deletes both. 
 
I do not know yet what happens if you try to delete EmployeePhoto. Presumably the Referential Integrity check will prevent this.
 
The net of these considerations is that you should hide them all from Employee consumers. You don't want them worrying about this stuff. You want them to think of the photo as part of the Employee ... not as another entity.
 
If you don't need to search for Photos independent of the Employee (and why would you?), I would try to make the EmployeePhoto entity internal to the domain model assembly.
 
[Again, haven't tried this yet either. Thinking out loud.]
 
Of course that means no consumer would ever see the EmployeePhoto entity. That's just fine. I'd make the Employee.EmployeePhoto navigation property internal as well or maybe private.
 
I'd create a public Photo class (if necessary). I'd add a custom Photo property for Employee, implemented such that it used the non-public EmployeePhoto property to acquire the photo and poured the photo bytes into an instance of the Photo class; this Photo class instance, not the EmployeePhoto entity, is what I'd return to the caller.
 
I'd also bury the mechanics of making a new EmployeePhoto inside this custom Employee.Photo property. Or maybe I'd create Employee.AddPhoto, Employee.UpdatePhoto, and Employee.DeletePhoto methods instead. It's up to you. The point is to encapsulate the mess.
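
A rough sketch of that encapsulation, in the same untested spirit. It uses plain stand-in classes rather than real DevForce entities, and the Func-based loader is an assumption that stands in for whatever lazy navigation the real entity performs:

```csharp
using System;

// Hypothetical stand-in for the split-off entity; kept internal to the model assembly.
internal sealed class EmployeePhoto
{
    public int EmployeeID { get; set; }
    public byte[] Photo { get; set; }
}

// Public, entity-free carrier for the photo bytes.
public sealed class Photo
{
    public Photo(byte[] bytes) { Bytes = bytes ?? new byte[0]; }
    public byte[] Bytes { get; private set; }
}

public class Employee
{
    // Assumption: stands in for the lazy navigation a real entity would perform.
    private readonly Func<int, EmployeePhoto> _loadPhoto;
    private EmployeePhoto _employeePhoto;

    internal Employee(int id, Func<int, EmployeePhoto> loadPhoto)
    {
        ID = id;
        _loadPhoto = loadPhoto;
    }

    public int ID { get; private set; }

    // Non-public navigation property: loads the photo entity on first access only.
    internal EmployeePhoto EmployeePhoto
    {
        get { return _employeePhoto ?? (_employeePhoto = _loadPhoto(ID)); }
    }

    // Consumers get the photo bytes in a Photo object, never the entity itself.
    public Photo Photo
    {
        get { return new Photo(this.EmployeePhoto.Photo); }
    }

    // Changes to the photo are funneled through the Employee as well.
    public void UpdatePhoto(byte[] bytes)
    {
        this.EmployeePhoto.Photo = bytes;
    }
}
```

Callers only ever touch Employee.Photo and Employee.UpdatePhoto; the EmployeePhoto type never leaks out of the assembly.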
 
Now it may be that the caller can no longer compose the query that eagerly loads EmployeePhotos with their employees (see Note #1).
 
I haven't tried this yet (as I keep saying). If the caller really can't compose that query, that's fine with me. If I want eager photo loading, I should cover the query in my Repository class, which DOES have access to the internal EmployeePhoto entity. I don't want outsiders to see me making sausage.
 
You do have a Repository class, right? You don't let your UI query all by itself, do you? That's not good design. Once you start thinking seriously about how and when to expose entities, you'll know that you want to encapsulate all the persistence machinery - including DevForce persistence machinery - in a Repository or "DataService" class.
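
For illustration only, a Repository along those lines might be shaped like this. The interface, its method names, and the IQueryable passed to the constructor are all hypothetical; the IQueryable stands in for the query root that an entity manager would supply:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in entity; a real model would use the generated Employee.
public class Employee
{
    public int ID { get; set; }
    public string City { get; set; }
}

// The only query surface the UI sees (hypothetical names).
public interface IEmployeeRepository
{
    List<Employee> GetEmployeesInCity(string city);
    Employee GetEmployeeById(int id);
}

public class EmployeeRepository : IEmployeeRepository
{
    // Assumption: stands in for the entity manager's Employees query root.
    private readonly IQueryable<Employee> _employees;

    public EmployeeRepository(IQueryable<Employee> employees)
    {
        _employees = employees;
    }

    public List<Employee> GetEmployeesInCity(string city)
    {
        // An eager-loading variant would add its Include here, out of the UI's sight.
        return _employees.Where(e => e.City == city).ToList();
    }

    public Employee GetEmployeeById(int id)
    {
        // Returns null when no employee has the given ID.
        return _employees.FirstOrDefault(e => e.ID == id);
    }
}
```

The UI asks for "employees in Boston" and never learns how, or from where, they were fetched.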
 
For the last time, I haven't tried these refinements yet. I'm broaching them in case you want to be a pioneer.
 

WardBell (IdeaBlade)
Posted: 21-Apr-2010 at 12:18am

The lazy property feature is not presently on the near term schedule (although this conversation has certainly got my attention).

I'd like to help you work through to something that is satisfactory.
 
How many of these overstuffed tables do you have :-) ?
Which of my "possible solutions" did you have in mind?
 
We should take this out of the forum I think. Please contact me at AskWard@ideablade.com

kshaban (Newbie)
Joined: 18-Apr-2010 | Posts: 2
Posted: 20-Apr-2010 at 10:22pm
Hi Ward,
 
Unfortunately, option 1 is not feasible because this is an existing 3.5 app built on top of Linq to SQL (which, BTW, does support lazy properties through DelayValue<T> backing fields).
 
The existing model has many delayed blobs and text properties.  Our "version" of Linq to SQL is a bit hacked up to provide future queries and the ability to do exactly like you alluded to with making view objects that are filled using projections.
 
However, our view objects are fully live since we also manage state and identity management.  We actually construct our own update and insert TSQL as well to compensate for a lack of batching of update and insert statements by Linq to SQL.
 
Of course, all this being said, we did not want to build all this, nor do we want to continue maintaining our own home-grown ORM; we would much rather transition to a full-fledged framework with true N-tier support.
 
Would what you referred to as a possible solution involve fully live view objects?
 
If the lazy-property features are something in the pipeline I would be happy to beta test for you as well.
 
Thanks,
 
Kavan

WardBell (IdeaBlade)
Posted: 20-Apr-2010 at 8:04pm
It appears that at least two other ORMs support lazy loaded properties:
 
 
LLBLGen (http://www.llblgen.com/documentation/2.6/hh_start.htm ... says Frans; although I couldn't find the exact page, I'm sure it's in there.)
 
Kudos to both.
 
As Ayende observes "This feature is mostly meant for unique circumstances, such as Person.Image, Post.Text, etc. As usual, be cautious in over using it."
 
Amen to that.
 
We'll get to it at some point (unless EF gets there first).

WardBell (IdeaBlade)
Posted: 19-Apr-2010 at 6:49pm
Hi Kavan -
 
I should have written this reply before the one above.
 
We have long recognized that many people have big tables and would like to lazily load some of the properties. All the machinations I described could be hidden by a sufficiently powerful framework.
 
I don't know a single Object Relational Mapping (ORM) framework that supports lazy properties. If you know of one, do tell.
 
Why do you not see lazy properties in Entity Framework or anywhere else? Because it's hard.
 
But we're in the business of "hard". We are going to add this feature some day.  We've talked about it. We know how to do it. It's just a matter of priorities.
 
W

WardBell (IdeaBlade)
Posted: 19-Apr-2010 at 6:41pm
Hi Kavan -
 
[NEW "TABLE SPLITTING" ANSWER IN MY 22-APR-2010 POST;
  keeping this answer intact for continuity / posterity]
 
Great question ... and one that troubles everyone who confronts a table with a blob in it ... whether you use DevForce, EF, ... another ORM ... raw ADO ... anything.
 
The best answer is always the same: get that big blob OUT of the table. Move it out to another table. Then build a related entity mapped to that blob table to which you can navigate.
 
Let's say you have an Employee and the Employee's Photo. Move the Photo out of the Employee table. If you're going to keep the photo in the database (not the only approach), put it in an EmployeePhoto table. Then you have an Employee entity that is trim and an EmployeePhoto entity which you fetch on demand (asynchronously one hopes).
 
Whenever I see this question, I give this advice. When I give this advice, people either take it ... or they tell me that they can't.
 
The only acceptable reason for not taking my advice ... is if you have legacy code that depends upon the blob being there and you are truly stuck with that legacy code.
 
Now I would fight like a cornered rat to escape this bind. I'd try to fake out the legacy code if I could. Maybe define a view and have the legacy code point to the view. All the better if the legacy code uses stored procedures to update the Employee. Then you split the table behind the scenes and no one is the wiser.
 
If you are trapped, I will continue with some alternatives. But if there is any possible way of moving the blob out ... stop ... read no further ... please just do it.
 
----
 
Ok ... you're still reading ... which means that you're determined to keep the blob in the table.
 
This will cost you. It will cost you in complexity if nothing else.  Here we go.
 
The Project + CUD Approach
 
You can separate the Query-for-Read code path from the Create/Update/Delete (CUD) code path.
 
When you want Employees for presentation purposes, you issue a "projection" query that returns only the properties of Employee that you want to retrieve in a list.
 
Anonymous Type Query
 
Here's how you'd write a query to project Employee into an anonymous type:
 
  var q1 = EM.Employees
             .Where(e => e.City == "Boston"); // Example select
 
  // Here comes the "projection"
  var q2 = q1.Select(e => new {ID = e.ID, FirstName = e.FirstName, ...}); // Everything but the photo
 
  // We'll use a synchronous query for this example; you'd use async in the real world.
  var emps = q2.ToList();
 
Unfortunately, "emps" is a list of some anonymous type. That's pretty hard to work with.
 
You want a client-side type that is public, that you can talk about, potentially bind to. The type will be strictly read only ... you won't use it for editing. It's simply a Data Transfer Object (DTO).
 
Suppose we call it "EmployeeDto". We want to fill it on the Server-side and have DevForce move it over the wire for us. So we'll make it serializable. We'll write it as part of our Server project and link to it in our Silverlight project (I'm assuming Silverlight here):
 
  [DataContract]
  [ReadOnly(true)] // UI hint to make entire class ReadOnly
  public partial class EmployeeDto: IKnownType {
    [DataMember] public int ID { get; set; }
    [DataMember] public string FirstName { get; set; }
    // More core fields
  }
 
We rewrite our query like so:
 
  var q2 = q1.Select( e =>
                new EmployeeDto {
                  ID = e.ID, FirstName = e.FirstName, ...} // Everything but the photo
               );
 
Ok, now that you have EmployeeDtos, what do you do with them? You present them.
 
How do you perform CUD operations? Not on the EmployeeDtos! When it's time to make changes, you work with whole Employee entities.
 
Presumably, you'll do so one at a time and happily pay the price of bringing down the blob as needed. When the user selects an EmployeeDto and "drills in to edit", you'll extract the Employee's ID and fetch the matching Employee from the database.
 
Now you have the full Employee ... with the photo. But you were going to get the photo anyway, right? At least you're only getting one photo, not every photo of every employee in the list.
 
Important note: EmployeeDto is NOT an entity. EmployeeDto is just a read-only bag of Employee data.  The query result is not in cache. If you query again with the same criteria you'll get a conceptually duplicate set of EmployeeDtos.
 
You shouldn't edit one of these things - don't be fooled by the public setters. You've got no validation or other business logic. You can't save this data back to the database (as it is now ... we have tricks ... but that's a different post).
 
The DTO has no navigation properties to any other entities. You can enrich the DTO with such properties if you like; it's not hard but it's plumbing you'll have to write.
 
Oh ... don't forget to "update" your EmployeeDto after the user edits the corresponding Employee. Remember that the Employee and the EmployeeDto have no knowledge of each other.
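
A minimal sketch of that drill-in-and-refresh cycle, using plain stand-in classes and an in-memory IQueryable in place of a real DevForce query. The EmployeeEditing helper and its method names are hypothetical, and the serialization attributes on EmployeeDto are left out to keep the sketch self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the full entity, blob included.
public class Employee
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public byte[] Photo { get; set; }
}

// Read-only list item; knows nothing about the Employee it came from.
public class EmployeeDto
{
    public int ID { get; set; }
    public string FirstName { get; set; }
}

public static class EmployeeEditing
{
    // Drill in: use the DTO's ID to fetch the one full Employee to edit.
    public static Employee FetchForEdit(IQueryable<Employee> employees, EmployeeDto selected)
    {
        return employees.First(e => e.ID == selected.ID);
    }

    // After the edit is saved, copy the changed values back to the DTO by hand.
    public static void Refresh(EmployeeDto dto, Employee edited)
    {
        if (dto.ID != edited.ID)
            throw new ArgumentException("DTO and Employee do not match.");
        dto.FirstName = edited.FirstName;
    }
}
```

Nothing keeps the two objects in sync automatically; forgetting the Refresh call leaves the list showing stale data.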
 
This is not what you signed up for when you chose DevForce. You're going back to managing your data by hand.  If you're doing a lot of this kind of thing, stop ... and reflect ... because you're not using DevForce as it was intended to be used. You are developing in a different paradigm. Not a bad paradigm, just a different one.
 
There is a way to get closer to the DevForce paradigm. It requires a little more setup. It's the View approach.
 
The View Approach
 
Question: is the blob field nullable? Please say "yes".  Because, if it is nullable, you can define in your database an updatable view over the Employee Table ... a view that excludes the EmployeePhoto column. Then you can create another view consisting of just the EmployeeID and the EmployeePhoto. And then you create two entities AS IF there were two tables.
 
Entity Framework (and, therefore, DevForce) will permit you to define ReadOnly entities for these views. You can define Insert, Update, and Delete methods in EF for the Employee entity. For EmployeePhoto you should define Update and Delete ... never Insert; the act of creating a new Employee will result in an EmployeePhoto.
 
Consult Julie Lerman's book on Entity Framework for details on these aspects of Entity Framework. She'll tell you how to write the CUD methods that make view-backed entities modifiable.
 
I'm not going to go further into the details of this approach. To be honest, I haven't checked on our support for View CUD methods in DevForce 2010 RC and we may have postponed that support until a future release.
 
We will support it and, even if we don't have that support baked in right now, I have a workaround in my back pocket to tide you over.  I'm sure you'll let me know if this is the way you want to go.
 


Edited by WardBell - 22-Apr-2010 at 3:59pm

kshaban (Newbie)
Posted: 18-Apr-2010 at 11:41pm
DevForce 10.0 looks absolutely great.
 
However, I am stumped...
 
How do I set up lazy loading of large properties on a BO?
 
Thanks in advance,
 
Kavan