TechEd North America 2014, Houston
Industrial-Strength Entity Framework – John Mason, Robert Vettor
Day 3, 14 May 2014, 5:00PM-6:15PM (DEV-B356)
Disclaimer: This post contains my own thoughts and notes based on attending TechEd North America 2014 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of either Microsoft, the presenters or the speakers.
Executive Summary—Sean’s takeaways
- Entity Data Model doesn’t have to be a 1-1 mapping to database tables
  - EDM represents the model that the developer works with
- Segregate models, based on functional context
  - Complex system would typically have several models
  - Data linking accomplished by bringing in only what’s needed from another model
  - E.g. Order EDM has Customer info, but only some basic read-only Customer attributes
- Transactions
  - EF does an implicit transaction on SaveChanges
  - You can now do explicit transactions
- Concurrency
  - You need to be thinking about it
  - Catch concurrency exceptions and do something about them (e.g. merge)
- Performance
  - Measure performance of queries with Profiler and Query Analyzer
  - Some good tricks for using Profiler
  - Stored Procedures are fine: first-class citizens in EF
  - Use async, which is prevalent in EF now
  - Query caching in EF: huge gains
- For best performance, use EF 6 and .NET 4.5+
  - Query caching requires EF 5, .NET 4.5
John Mason – Senior Developer Consultant, Microsoft Premier Services
Rob Vettor – Senior Developer Consultant, Microsoft Premier Services
Simple Layered (Multitier) Model
- People sometimes don’t understand what’s in each box
More Complex Layered (Multitier) Model
- Service Layer blown out
  - KPIs – business (transactions), devs (perfmon)
  - Unit of Work – how to talk to Data Layer
Complex Layered Multitier Model
- This is more typical of a modern multi-tiered system
- Data Tier
  - Data Sources might be a database or something else
  - EF will at some point be able to talk to non-relational data sources
  - “Critically important” that we accept NoSQL model
  - This is where Entity Framework fits into the larger picture
Enterprise Apps are Data Aware
What is a Model?
- EF exposes Entity Data Model
- On the left side are database tables
- On the right side, classes
  - Code against data model that might not map 1-1 to database
  - This is the object model
  - Default behavior is 1-1 mapping between database table and a class
  - Dev’s shape of the data vs. DBA shape of the data
  - If DBA changes schema, only the mapping (arrows in middle) has to change
Model Simplifies the Data
- Model simplifies data
  - Validates your understanding
  - Stakeholders can more easily understand
  - Reduces complexity
- Defining the model
  - Sketch data model for enterprise
  - Create initial mappings
  - Identify functional groupings
- Abstraction has cost
  - Overhead of models
  - Overloading with entire entity graph might not make sense
  - Good idea to work with subset of entities
Model Segregation
- Break model up into smaller functional contexts
  - Can have several different models (EDMs) that segment the business domain
  - Call these “functional contexts”
- Benefits
  - Each functional context targets specific area of business domain
  - Reduces complexity
  - Functionality can evolve and refactoring can occur on separate timelines
Improve Model by Refactoring
- Customer vs. Order contexts
- Customer context – might be where you actually change customer
- Order context just needs a read-only reference to customer
  - So you just pull in subset of tables/fields, read-only
- Helps ensure that devs use the context properly
  - Avoid architectural erosion: if you don’t understand the architecture, you make changes that don’t match the original vision
  - Also enables teams to work separately
Enterprise Apps are Transactional
- Typically
Transaction
- Transactions are
  - Related operations, single unit of work
  - Either succeeds or fails (Commit or Rollback)
  - Guarantees data consistency / integrity
- Basic types
  - Local
    - Single-phased (1 database) handled directly by database provider
  - Distributed
    - Affects multiple resources
    - Coordinated by 3rd party transaction manager component
    - Implement two-phased commit: present candidate changes, then go ahead and do it
- Local Transactions
  - Implicit
    - Built into EF’s internal architecture – automatic
    - SaveChanges invokes implicit transaction
      - EF automatically either commits or rolls back
  - Explicit – Two Types
    - BeginTransaction
    - UseTransaction
Demo – Transactions
- Default behavior
  - Profiler running in background, shows SQL being used
  - After SaveChanges, do raw SQL Command, which gets its own transaction
  - Another set of changes and then SaveChanges: another BEGIN TRANSACTION / COMMIT TRANSACTION
  - Doing smaller transactions is good practice for highly transactional database
- BeginTransaction (new in EF 6)
  - ctx.Database.BeginTransaction
  - trans.Commit
  - Now we get everything within one transaction
- UseTransaction
  - Create new connection, outside of EF
  - On Connection, BeginTransaction
  - Create EF context, pass in conn, which has transaction
  - ctx.Database.UseTransaction(trans)
  - Still assumes single database
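The demo itself was C# against SQL Server. As a language-neutral illustration of what "everything within one transaction" buys you, here is a minimal sketch in Python, with the standard-library `sqlite3` module standing in for the database. The `account` table and the function names are hypothetical, not from the session.

```python
import sqlite3

def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode,
    # so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")
    return conn

def transfer_all_or_nothing(conn, fail_midway=False):
    """Run two related writes inside one explicit transaction."""
    cur = conn.cursor()
    cur.execute("BEGIN")  # plays the role of an explicit BeginTransaction
    try:
        cur.execute("UPDATE account SET balance = balance - 50 WHERE id = 1")
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        cur.execute("UPDATE account SET balance = balance + 50 WHERE id = 2")
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")  # neither write survives
        raise

def balances(conn):
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
```

If the failure is simulated, both balances stay at their starting values; without the explicit transaction the first update would already have been committed on its own.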
Best Practices – Transactions
- Favor implicit local transactions
  - All you need, usually
  - Automatic when calling SaveChanges()
  - Implements Unit of Work pattern
- Explicit local transaction
  - BeginTransaction, UseTransaction
- Distributed transactions
  - Favor TransactionScope
Enterprise Apps are Concurrent
Concurrency
- When multiple users modify data at the same time
- Two models
  - Optimistic concurrency
    - Pro: Rows not locked between initial query and update
    - Con: Database value can be overwritten
  - Pessimistic concurrency
    - Pro: Row locked until operation complete
    - Con: Not scalable, excessive locking, lock escalation
    - Con: No out-of-box support in ADO.NET and EF
- Concurrency in EF
  - Out-of-box, EF has optimistic concurrency – last in wins
  - Can detect concurrency conflicts by adding concurrency token
  - On update or delete, EF automatically adds the concurrency check to the WHERE clause
  - If there is a concurrency conflict, EF throws DbUpdateConcurrencyException
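The token-in-the-WHERE-clause idea can be shown without EF. The sketch below is Python with `sqlite3` as a stand-in; the `customer` table, its `version` column, and the `ConcurrencyConflict` exception are hypothetical names chosen for illustration, with the exception playing the role of DbUpdateConcurrencyException.

```python
import sqlite3

class ConcurrencyConflict(Exception):
    """Raised when the row no longer matches the version that was read."""

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
    conn.execute("INSERT INTO customer VALUES (1, 'Ann', 1)")
    return conn

def update_name(conn, customer_id, new_name, version_read):
    # The version seen at read time goes into the WHERE clause as the concurrency token.
    cur = conn.execute(
        "UPDATE customer SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, customer_id, version_read),
    )
    # Zero affected rows means the row was changed (or deleted) since it was read.
    if cur.rowcount == 0:
        raise ConcurrencyConflict(f"customer {customer_id} was modified by someone else")
```

Two writers that both read version 1 cannot both succeed: the first update bumps the version, so the second matches no row and raises instead of silently overwriting.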
Resolve Concurrency Conflicts
- Client Wins
  - Out-of-box behavior
- Server wins
  - First in stays – your changes discarded
  - Notify user of conflicts
- Code custom algorithm
  - Let business decide
  - EF has: 1) original data; 2) your changed data; 3) data currently in database
Demo – Concurrency
- When you catch DbUpdateConcurrencyException, that is a good place to apply a strategy pattern
- E.g. custom merge
- GetDatabaseValues, OriginalValues, CurrentValues
- Merge is sometimes possible (if we’re not changing the same thing)
- On merge failure (true conflict), we log it
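A custom merge over the three sets of values (original, current, database) might look roughly like the following. This is a Python sketch of the general three-way merge idea, not the presenters' demo code; the function name and the dictionary-based representation are assumptions.

```python
def three_way_merge(original, current, database):
    """Merge field by field; return (merged_values, conflicting_fields)."""
    merged, conflicts = {}, []
    for field in original:
        mine_changed = current[field] != original[field]
        theirs_changed = database[field] != original[field]
        if mine_changed and theirs_changed and current[field] != database[field]:
            # True conflict: both sides changed the same field to different values.
            conflicts.append(field)
            merged[field] = database[field]  # keep the stored value; caller logs or escalates
        elif mine_changed:
            merged[field] = current[field]   # only I changed it
        else:
            merged[field] = database[field]  # only they changed it, or nobody did
    return merged, conflicts
```

When the two users touched different fields the conflict list is empty and the merged values can be saved; a non-empty list is the "merge failure" case that gets logged.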
“Programming to DBContext” – Rowan Miller
Enterprise Apps are Performant
Performance in Apps
- Defining performance
  - Emergent property, blends latency and underlying infrastructure
- Measuring performance
  - Can’t improve what you don’t measure
  - Business KPIs and application KPIs
  - Use application KPIs to keep track of application metrics that are meaningful to you
  - Who falls over first: business tier, database, etc.
- Performance impactors
  - Can buy your way out of bandwidth issues
  - Can’t buy fix for latency
  - Must be considered up front
Performance rules
- Choose right architecture
- Test early/often
- Whenever you make a change, re-test
Performance Pillar – Efficient Queries
Default Query Behavior
- LINQ query eventually transformed into SQL query
- EF materializes untyped SQL result set into strongly-typed entities
Query Performance
- Is Your SQL Efficient?
  - Customers assume that LINQ is smarter than it actually is
  - Looks like SQL but doesn’t necessarily act like SQL
  - EF may not be efficient
- Must Profile LINQ Queries
  - Structure of LINQ expressions can have tremendous impact on query performance
  - EF can’t determine query efficiency from LINQ expression
- The Truth is in the SQL execution plan
  - Actual, not estimated
In a perfect world, you’d profile all of your queries
Profiling Check List
- Set Application Name Attribute in conn string
  - Allows filtering just SQL from your application
- Run SQL Trace
- Run Query Analyzer
- SET STATISTICS TIME ON/OFF
- Review Actual Execution Plan
Demo – Profiling
- Profile slow or complicated query
- Tools | SQL Server Profiler
- Start trace
- Events Selection
  - No Audit, ExistingConnection
  - Only show RPC:Completed, SQL:BatchCompleted
- Column Filters
  - Application Name like “EF” (or whatever you put in conn string)
- Run
- Run simple query
- In Profiler, you see full SQL query
  - Tells us what application is doing: just the query
- Copy out of Profiler
- New Query window
- Paste in
- SET STATISTICS TIME ON
  - (and OFF at bottom)
- Select database
- Query | Include Actual Execution Plan
- Get rows back, took 94 ms, then 35 ms
  - SQL caching
- Look at Execution plan
  - Favor Seeks over Scans
  - Clustered Index Scan
  - Given structure of database, we can’t improve on it
- Look at how we’d write query
  - Query plan is the same on your own query
  - You really don’t know how a query works until you look at the Query Execution Plan
  - If you know of a better way to get to the data, you can use a Stored Proc
Can change underlying LINQ statement
- Sometimes can find better way
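The "look at the plan, favor seeks over scans" habit carries over to any database. As a small self-contained illustration, the Python sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for SQL Server's actual execution plan; SQLite reports SCAN and SEARCH where SQL Server shows scans and seeks. The `orders` table and index name are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the detail text

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
sql = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index on customer_id the only option is to read the whole table.
plan_without_index = query_plan(conn, sql, (42,))

# With an index the planner can go straight to the matching rows.
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, sql, (42,))

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, so treat the printed output as something to read rather than to parse.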
What About Stored Procedures?
- EF Loves to Generate SQL
  - Default behavior – devs do LINQ, EF gens SQL
- Sprocs are always an option
  - Simple to map select/update ops to stored procs
  - EF generates a method through which to invoke them
- Consistency across EF
  - Same experience
  - Full change tracking
  - Right tool for the Job
“Don’t fear the SPROC”
Performance Best Practices
- Profile early/often
  - Slow operations or complex queries
- Must Haves…
  - Application Name conn string attribute
  - Set Statistics Time On/Off
  - Actual vs. Estimated Execution Plan
- Don’t fear the Sproc
  - Stored procs are first-class citizens in EF
  - Customers often blend both EF queries and stored procedures in hybrid approach
  - Common customer usage pattern for large application
80/20
- 80% CRUD simple queries
- 20% more complex
  - Good place for stored procedure
Performance Pillar – Asynchronous Operations
Async
- EF 6 Exposes Asynchronous Operations
  - Most IQueryable<T> methods now async-enabled
  - Leverages recent Async/Await pattern
- Fast and Fluid Experience
  - Increase client responsiveness, server scalability
  - Prevents blocking of main thread
  - *** more ***
Demo – Async
- Method that does async call, do await
- In async function, return Task
- await ctx.SaveChangesAsync()
- FirstOrDefaultAsync
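The point of awaiting a query is that the calling thread is free to do other work while the database responds. The sketch below shows that idea with Python's `asyncio` rather than C#; `first_or_default` is a hypothetical stand-in for a call like FirstOrDefaultAsync, and the sleep simulates database latency.

```python
import asyncio

async def first_or_default(rows, predicate, latency=0.05):
    """Simulated async query: returns the first matching row, or None."""
    await asyncio.sleep(latency)  # control goes back to the event loop while "the database works"
    return next((row for row in rows if predicate(row)), None)

async def ticker(ticks, interval=0.01):
    """Stands in for other work (UI updates, other requests) sharing the same thread."""
    while True:
        ticks.append(1)
        await asyncio.sleep(interval)

async def main():
    ticks = []
    background = asyncio.create_task(ticker(ticks))
    customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
    # The two queries are awaited one after another; the ticker keeps running meanwhile.
    found = await first_or_default(customers, lambda c: c["id"] == 2)
    missing = await first_or_default(customers, lambda c: c["id"] == 99)
    background.cancel()
    return found, missing, len(ticks)
```

Running `asyncio.run(main())` returns the matching row, `None` for the missing one, and a tick count greater than zero, showing that the thread was not blocked during the waits.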
Performance Pillar: Caching
Performance Enhancements
- Overall Query Construction
- Performance Improvements
  - EF 5 and EF 6 have big perf improvements
  - And need .NET Framework 4.5+
Query Caching
- Earlier
  - LINQ query
  - Compiler builds expression tree
  - EF parses tree and builds SQL query
  - Repeated each time query executed
- Now, Autocompiled Queries
  - EF 5+, SQL query constructed once, stored in EF query cache and reused
  - Parameterized – same query/different parameter values
  - For your application, all users share this cache (App Domain)
  - Supports up to 800 queries
  - Uses a usage-based algorithm (least frequently / recently used) to decide which queries to drop
  - This is (obviously) different from SQL caching
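To make "constructed once, reused with different parameter values" concrete, here is a minimal Python sketch of a bounded translation cache. It is not EF's implementation: the class and method names are hypothetical, the "translation" is a trivial string build, and plain least-recently-used eviction is swapped in as a simple stand-in for whatever policy EF actually applies.

```python
from collections import OrderedDict

class QueryPlanCache:
    def __init__(self, capacity=800):
        self.capacity = capacity
        self._plans = OrderedDict()
        self.translations = 0  # counts how often the expensive translation step ran

    def get_sql(self, table, filter_columns):
        # The key is the query shape only; parameter values are not part of it.
        key = (table, tuple(filter_columns))
        if key in self._plans:
            self._plans.move_to_end(key)  # mark as recently used
            return self._plans[key]
        self.translations += 1
        where = " AND ".join(f"{col} = ?" for col in filter_columns)
        sql = f"SELECT * FROM {table} WHERE {where}"
        self._plans[key] = sql
        if len(self._plans) > self.capacity:
            self._plans.popitem(last=False)  # evict the least recently used entry
        return sql
```

Asking for the same shape twice translates it once; the second call is a dictionary lookup, which is where the gain comes from when the real translation is costly.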
Closing Thoughts
Step 0 – Learn the Framework
- Framework is complex
- Allocate time to learn
  - Understand LINQ and subtle nuances of each
  - Understand immediate vs. deferred execution
  - For Microsoft Premier customers, Premier EF training
Books
- Programming Entity Framework
- DbContext
- Code First
- Entity Framework 4.1: Expert’s Cookbook
Entity Framework 6 Recipes – Rob Vettor