That Conference 2017 – Concurrent Programming in .NET

That Conference 2017, Kalahari Resort, Lake Delton, WI
Concurrent Programming in .NET – Jason Bock (@JasonBock)

Day 2, 8 Aug 2017

Disclaimer: This post contains my own thoughts and notes based on attending That Conference 2017 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of the speakers.

Executive Summary

  • Doing concurrency correctly is very hard when doing it at a lower level
  • Don’t use Thread, ThreadPool anymore
  • Learn and use async/await


Background

  • Concurrency is tough, not as easy as doing things sequentially
  • Games are turn-based
    • But interesting exercise to do chess concurrently
    • Much more complicated, just as a game
  • We’ll just look at typical things that people struggle with


Terms

  • Concurrency – Work briefly and separately on each task, one at a time
  • Parallelism – People all working at the same time, doing separate tasks
  • Asynchronous – One you start a task, you can go off and do something else in the meantime


Recommendations

  • Stop using Thread, ThreadPool directly
  • Understand async/await/Task
  • Use locks wisely
  • Use concurrent and immutable data structures
  • Let someone else worry about concurrency (actors)


Stop Using Thread, ThreadPool

  • This was the way to do concurrency back in .NET 1.0
  • Diagram – Concurrent entities on Windows platform
    • Process – at least one thread
    • Thread – separate thread of execution
    • Job – group of processes
    • Fiber – user-level concurrent tasks
  • But mostly we focus on process and thread
  • Example–multiple threads
    • Use Join to observe at end
  • Problems
    • Stack size
    • # physical cores
  • # Cores
    • May not really be doing things in parallel
    • More threads than cores–context switch and run one at a time on a core
    • Context switches potentially slow things down
    • Don’t just blindly launch a bunch of threads
  • Memory
    • Each thread -> 1 MB memory
  • ThreadPool is an improvement
    • You don’t know when threads are complete
    • Have to use EventWaitHandle, WaitAll
    • Lots of manual works
    • At least threads get reused
  • Existing code that uses Threads works just fine


Async/Await/Tasks

  • Models
    • Asynchronous Programming Model (APM)
    • Event-based Asynchronous Patter (EAP)
    • Task-Based Asynchrony (TAP)
  • Demo – console app
    • AsyncContext.Run – can’t call async method from non-async method
    • AsyncContext – from Stephen Cleary – excellent book
    • This goes away in C# 7.1 – compiler will allow calling async method
    • Reading file
  • Misconception – that you create threads when you call async method
    • No, not true
    • Just because method is asynchronous, it won’t necessarily be on another thread
  • Use async when doing I/O bound stuff
    • Calling ReadLineAsync, you hit IO Completion Point; when done, let caller know
    • When I/O is done, calling thread can continue
    • Asynchronous calls may also not necessarily hit IOCP
  • If you do I/O in non-blocking way on a thread, you can then use thread to do CPU work while the I/O is happening
    • Performance really does matter–e.g. use fewer servers
  • When you call async method, compiler creates asynchronous state machine
    • Eric Lippert’s continuation blog post series
  • Task object has IsCompleted
    • Generated code need to first check state to see if task completed right away
  • Good news is–that you don’t need to write the asynch state machine plumbing code
  • Can use Tasks to run things in parallel
    • Task.Run(() => ..)
    • await Task.WhenAll(t1, t2);
  • Tasks are higher level abstraction than threads
  • Don’t ever do async void
    • Only use it for event handlers
  • Keep in mind that async tasks may actually run synchronously, under the covers


Demo – Evolving Expressions

  • Code uses all cores available


Locks

  • Don’t use them (or try not to use them)
  • If you get them wrong, you can get deadlocks
  • Don’t lock on
    • Strings – You could block other uses of string constant
    • Integer – you don’t lock on the same thing each time, since you lock on boxed integer
  • Just use object to lock on
  • Interlocked.Add/Decrement
    • For incrementing/decrementing integers
    • Faster
  • Tool recommendation: benchmark.net
  • SpinLock
    • Enter / Exit
    • Spins in a while loop
    • Can be slower than lock statement
    • Don’t use SpinLock unless you know it will be faster than other methods


Data Structures

  • List<T> is not thread-safe
  • ConcurrentStack<T>
    • Use TryPop
  • ImmutableStack<T>
    • When you modify, you must capture the new stack, result of the operation
    • Objects within the collection can change


Actors

  • Service Fabric Reliable Actors
  • Benefits
    • Resource isolation
    • Asynchronous communication, but single-threaded
      • The actor itself is single-threaded
      • So in writing the actor, you don’t need to worry about thread safety


Demo – Actors – Using Orleans

  • “Grains” for actors

The Baby Has Two Eyeballs

eyeballs

Driving home the other night, I was teasing my 4-yr old daughter (as I often do).  We were talking about what sorts of games to play when we got home and I suggested that we could spend our time poking Lucy’s baby brother Daniel in the eye.  I also pointed out that we’d have to take turns and asked Lucy which of us should go first.

Lucy protested.  “Daddy”, she said excitedly, “we don’t have to take turns–Daniel has two eyeballs”!

After I stopped laughing at this, I realized that Lucy had seen something very important, which hadn’t even occurred to me:

Poking a baby in the eye is something that can be done in parallel!

Yes, yes, of course–you should never poke a baby in the eye.  And if you have a pre-schooler, you should never suggest eye-poking as a legitimate game to play, even jokingly.  Of course I did explain to Lucy that we really can’t poke Daniel in the eye(s).

But Lucy’s observation pointed out to me how limited I’d been in my thinking.  I’d just assumed that poking Daniel in the eye was something that we’d do serially.  First I would poke Dan in the eye.  And then, once finished, Lucy would step up and take her turn poking Dan in the eye.  Lucy hadn’t made this assumption.  She immediately realized that we could make use of both eyeballs at the same time.

There’s an obvious parallel here to how we think about writing software.  We’ve been programming on single-CPU systems for so long, that we automatically think of an algorithm as a series of steps to be performed one at a time.  We all remember the first programs that we wrote and how we learned about how computers work.  The computer takes our list of instructions and patiently executes one at a time, from top to bottom.

But, of course, we no longer program in a single-CPU environment.  Most of today’s desktop (and even laptop) machines come with dual-core CPUs.  We’re also seeing more and more quad-core machines appear on desktops, even in the office.  We’ll likely see an 8-core system from Intel in early 2010 and even 16-core machines before too long.

So what does this mean to the average software developer?

We need to stop thinking serially when writing new software.  It just doesn’t cut it anymore to write an application that does all of its work in a single thread, from start to finish.  Customers are going to be buying “faster” machines which are not technically faster, but have more processors.  And they’ll want to know why your software doesn’t run any faster.

As developers, we need to start thinking in parallel.  We have to learn how to decompose our algorithms into groups of related tasks, many of which can be done in parallel.

This is a paradigm shift in how we design software.  In the same way that we’ve been going through a multiprocessing hardware revolution, we need to embark on a similar revolution in the world of software design.

There are plenty of resources out there to help us dive into the world of parallel computing.  A quick search for books reveals:

Introduction to Parallel Computing – Grama, Karypis, Kumar & Gupta, 2003.

The Art of Multiprocessor Programming – Herlihy & Shavit, 2008.

Principles of Parallel Programming – Lin & Snyder, 2008.

Patterns for Parallel Programming – Mattson, Sanders & Massengill, 2004.

The following paper is also a good overview of hardware and software issues:

The Landscape of Parallel Computing Research: A View from Berkeley – Asanovic et al, 18 Dec 2006.

My point here is this:  As a software developer, it’s critical that you start thinking about parallel computing–not just as some specialized set of techniques that you might use someday, but as one of the primary tools in your toolbox.

After all, today’s two-eyed baby will be sporting 16 eyes before we know it.  Are you ready to do some serious eye-poking?