That Conference 2017 – Concurrent Programming in .NET

That Conference 2017, Kalahari Resort, Lake Delton, WI
Concurrent Programming in .NET – Jason Bock (@JasonBock)

Day 2, 8 Aug 2017

Disclaimer: This post contains my own thoughts and notes based on attending That Conference 2017 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of the speakers.

Executive Summary

  • Doing concurrency correctly is very hard when doing it at a lower level
  • Don’t use Thread, ThreadPool anymore
  • Learn and use async/await


Background

  • Concurrency is tough, not as easy as doing things sequentially
  • Games are turn-based
    • But interesting exercise to do chess concurrently
    • Much more complicated, just as a game
  • We’ll just look at typical things that people struggle with


Terms

  • Concurrency – Work briefly and separately on each task, one at a time
  • Parallelism – People all working at the same time, doing separate tasks
  • Asynchronous – Once you start a task, you can go off and do something else in the meantime


Recommendations

  • Stop using Thread, ThreadPool directly
  • Understand async/await/Task
  • Use locks wisely
  • Use concurrent and immutable data structures
  • Let someone else worry about concurrency (actors)


Stop Using Thread, ThreadPool

  • This was the way to do concurrency back in .NET 1.0
  • Diagram – Concurrent entities on Windows platform
    • Process – at least one thread
    • Thread – separate thread of execution
    • Job – group of processes
    • Fiber – user-level concurrent tasks
  • But mostly we focus on process and thread
  • Example–multiple threads
    • Use Join to observe at end
  • Problems
    • Stack size
    • # physical cores
  • # Cores
    • May not really be doing things in parallel
    • More threads than cores–context switch and run one at a time on a core
    • Context switches potentially slow things down
    • Don’t just blindly launch a bunch of threads
  • Memory
    • Each thread -> 1 MB memory
  • ThreadPool is an improvement
    • You don’t know when threads are complete
    • Have to use EventWaitHandle, WaitAll
    • Lots of manual work
    • At least threads get reused
  • Existing code that uses Threads works just fine
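
To make the contrast above concrete, here is a minimal sketch (mine, not the speaker's demo) of the same work done with explicit threads plus Join and then with tasks. SumTo and the class name are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadVersusTask
{
    // Hypothetical CPU-bound work item used only for this illustration.
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    // .NET 1.0 style: one dedicated thread per work item, results stored by hand.
    public static long[] RunWithThreads(int[] inputs)
    {
        var results = new long[inputs.Length];
        var threads = new Thread[inputs.Length];
        for (int i = 0; i < inputs.Length; i++)
        {
            int index = i;  // capture a copy; the loop variable itself keeps changing
            threads[i] = new Thread(() => results[index] = SumTo(inputs[index]));
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();  // block until every thread has ended
        return results;
    }

    // Task style: work is queued to the thread pool and results come back via Task<T>.
    public static async Task<long[]> RunWithTasksAsync(int[] inputs)
    {
        Task<long>[] tasks = inputs.Select(n => Task.Run(() => SumTo(n))).ToArray();
        return await Task.WhenAll(tasks);
    }
}
```

The thread version creates one thread per input regardless of core count, while the task version leaves scheduling to the thread pool.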


Async/Await/Tasks

  • Models
    • Asynchronous Programming Model (APM)
    • Event-based Asynchronous Pattern (EAP)
    • Task-based Asynchronous Pattern (TAP)
  • Demo – console app
    • AsyncContext.Run – needed because a non-async Main can’t await an async method
    • AsyncContext – from Stephen Cleary – excellent book
    • This goes away in C# 7.1 – the compiler allows Main itself to be async
    • Reading file
  • Misconception – that you create threads when you call async method
    • No, not true
    • Just because method is asynchronous, it won’t necessarily be on another thread
  • Use async when doing I/O bound stuff
    • Calling ReadLineAsync, you hit an I/O completion port (IOCP); when the I/O is done, the caller is notified
    • When I/O is done, calling thread can continue
    • Asynchronous calls may also not necessarily hit IOCP
  • If you do I/O in non-blocking way on a thread, you can then use thread to do CPU work while the I/O is happening
    • Performance really does matter–e.g. use fewer servers
  • When you call async method, compiler creates asynchronous state machine
    • Eric Lippert’s continuation blog post series
  • Task object has IsCompleted
    • Generated code needs to first check state to see if the task completed right away
  • Good news is that you don’t need to write the async state machine plumbing code
  • Can use Tasks to run things in parallel
    • Task.Run(() => ..)
    • await Task.WhenAll(t1, t2);
  • Tasks are higher level abstraction than threads
  • Don’t ever do async void
    • Only use it for event handlers
  • Keep in mind that async tasks may actually run synchronously, under the covers
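
As a rough illustration of the two uses noted above (awaiting I/O, and running CPU work in parallel with Task.Run and Task.WhenAll), here is a small sketch. The class and method names are my own and were not part of the talk.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncSketch
{
    // I/O-bound: awaiting ReadLineAsync lets the calling thread go do other work
    // while a read is pending.
    public static async Task<int> CountLinesAsync(string path)
    {
        int count = 0;
        using (var reader = new StreamReader(path))
        {
            while (await reader.ReadLineAsync() != null)
            {
                count++;
            }
        }
        return count;
    }

    // CPU-bound: Task.Run queues each delegate to the thread pool,
    // and Task.WhenAll completes once both have finished.
    public static async Task<int> AddInParallelAsync(Func<int> first, Func<int> second)
    {
        Task<int> t1 = Task.Run(first);
        Task<int> t2 = Task.Run(second);
        int[] results = await Task.WhenAll(t1, t2);
        return results[0] + results[1];
    }
}
```

With C# 7.1 or later, a `static async Task Main()` can await these methods directly. Before that, something like AsyncContext.Run is needed to bridge from a synchronous Main.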


Demo – Evolving Expressions

  • Code uses all cores available


Locks

  • Don’t use them (or try not to use them)
  • If you get them wrong, you can get deadlocks
  • Don’t lock on
    • Strings – You could block other uses of string constant
    • Integer – you don’t lock on the same thing each time, since you lock on boxed integer
  • Just use object to lock on
  • Interlocked.Increment/Decrement/Add
    • For incrementing/decrementing integers
    • Faster
  • Tool recommendation: BenchmarkDotNet
  • SpinLock
    • Enter / Exit
    • Spins in a while loop
    • Can be slower than lock statement
    • Don’t use SpinLock unless you know it will be faster than other methods
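
A minimal sketch of the two safer options from the notes above: locking on a dedicated private object, and using Interlocked for a simple counter. The Counters class is hypothetical and only here to show the shape.

```csharp
using System.Threading;
using System.Threading.Tasks;

public class Counters
{
    // Dedicated private object: no other code can lock on it by accident,
    // unlike a string constant or a boxed integer.
    private readonly object _sync = new object();
    private int _lockedCount;
    private int _interlockedCount;

    public int LockedCount { get { lock (_sync) { return _lockedCount; } } }
    public int InterlockedCount { get { return Volatile.Read(ref _interlockedCount); } }

    public void IncrementWithLock()
    {
        lock (_sync)
        {
            _lockedCount++;
        }
    }

    public void IncrementWithInterlocked()
    {
        // Atomic increment without taking a lock.
        Interlocked.Increment(ref _interlockedCount);
    }

    // Hammers both counters from multiple threads so lost updates would show up.
    public static Counters RunDemo(int iterations)
    {
        var counters = new Counters();
        Parallel.For(0, iterations, i =>
        {
            counters.IncrementWithLock();
            counters.IncrementWithInterlocked();
        });
        return counters;
    }
}
```

Both counters should end up equal to the iteration count. Which approach is faster on a given machine is something to measure rather than assume.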


Data Structures

  • List<T> is not thread-safe
  • ConcurrentStack<T>
    • Use TryPop
  • ImmutableStack<T>
    • When you modify, you must capture the new stack, result of the operation
    • Objects within the collection can change
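
Here is a short sketch of the two stack types mentioned above. The helper names are mine. Note that ImmutableStack<T> lives in System.Collections.Immutable, which on the full .NET Framework comes from a NuGet package.

```csharp
using System.Collections.Concurrent;
using System.Collections.Immutable;

public static class StackSketch
{
    // TryPop never throws on an empty stack; it reports success via its return value.
    public static int PopOrDefault(ConcurrentStack<int> stack, int fallback)
    {
        int value;
        return stack.TryPop(out value) ? value : fallback;
    }

    // Push returns a new stack, so the result must be captured each time.
    // The stack passed in by the caller is left unchanged.
    public static ImmutableStack<int> PushAll(ImmutableStack<int> stack, params int[] values)
    {
        foreach (int v in values)
        {
            stack = stack.Push(v);
        }
        return stack;
    }
}
```

Forgetting to capture the return value of Push is the classic mistake with the immutable collections: the call succeeds, but the stack you are holding has not changed.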


Actors

  • Service Fabric Reliable Actors
  • Benefits
    • Resource isolation
    • Asynchronous communication, but single-threaded
      • The actor itself is single-threaded
      • So in writing the actor, you don’t need to worry about thread safety


Demo – Actors – Using Orleans

  • “Grains” for actors

Object Disposal and BackgroundWorker Object in .NET

Here are a few notes about best practices related to:

  • IDisposable and an object’s Dispose() method
  • The using statement
  • Disposal of BackgroundWorker objects

(NOTE: BackgroundWorker is no longer the preferred mechanism for doing work on a background thread in C#, given that the language supports task-based asynchrony with the async/await constructs. However, many legacy applications still make use of the BackgroundWorker class.)

Q: What’s the goal of the using statement and IDisposable (Dispose) interface?

A: (short) To tell an object when it can clean up unmanaged resources that it might be hanging onto

A: (longer)

  • .NET code can make use of managed resources (e.g. instantiate another .NET object) or unmanaged resources (e.g. open a file to read from it)
  • Managed resources are released by the garbage collector (GC) automatically
    • Note that this is non-deterministic, i.e. you can’t predict when an object will be GC’d
  • To release an unmanaged resource, code typically follows this pattern:
    • Release resource in finalizer  (~ syntax).  Finalizer called during GC, so unmanaged resource is then released when object is being GC’d
    • Optionally, can support IDisposable (Dispose method)
      • Client calls Dispose before object is GC’d to release unmanaged resource earlier than normal GC
      • Allows for deterministic destruction
      • using statement automates calling of Dispose on an object
    • Classes implementing Dispose will still get GC’d normally at a later time
      • If Dispose was called first, code typically tells GC not to call its finalizer, since Dispose has already done the finalizer’s work (GC.SuppressFinalize)
      • If client failed to call Dispose, finalizer runs normally, so unmanaged resources then get cleaned up before GC
    • Objects with finalizers take a little bit longer to be GC’d
    • Here’s how IDisposable is typically implemented – http://csharp.2000things.com/2011/10/11/430-a-dispose-pattern-example/
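
For reference, here is a minimal sketch of the pattern described above. It is my own illustration, not the code from the linked post. NativeBuffer is a hypothetical class that owns a block of unmanaged memory.

```csharp
using System;
using System.Runtime.InteropServices;

public class NativeBuffer : IDisposable
{
    private IntPtr _memory;   // the unmanaged resource
    private bool _disposed;

    public NativeBuffer(int bytes)
    {
        _memory = Marshal.AllocHGlobal(bytes);
    }

    public bool IsDisposed { get { return _disposed; } }

    // Deterministic cleanup: called by the client or by a using statement.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // the finalizer has nothing left to do
    }

    // Safety net: runs during GC only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;       // makes repeated Dispose calls harmless

        // When disposing is true we were called from Dispose(), so it would also
        // be safe to dispose other managed objects here. This class has none.
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
        _disposed = true;
    }
}
```

A client would normally write `using (var buffer = new NativeBuffer(1024)) { ... }`, which calls Dispose at the closing brace even if an exception is thrown inside the block.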

 

Q: When should I use the using statement?

A: Typically, you should use the using statement to invoke Dispose on any object that implements IDisposable

 

Q: What happens if I don’t call Dispose or use the using statement?

A: (short) Unmanaged resources are (typically) released a bit later than they otherwise would be

A: (longer)

  • If you don’t call Dispose on an object that implements IDisposable, it typically hangs onto unmanaged resources until it is GC’d and then releases them
  • Depending on the type of resource, the first object may block access to the resource until it’s released
  • Failing to use using (or call Dispose) typically doesn’t lead to a memory leak. Rather, it just means that resources are released a bit later

 

Q: Should I use a using statement for a BackgroundWorker object?

A: (short) Yes, since BackgroundWorker has Dispose method (although calling Dispose doesn’t actually do anything)

A: (longer)

  • It’s okay to use using on BackgroundWorker, since it does implement IDisposable
  • BackgroundWorker, however, doesn’t actually do anything when Dispose is called.  Its parent class, Component, detaches from its ISite container, but this is only relevant in Windows Forms.
  • Calling Dispose does suppress finalization, which means that the BackgroundWorker will be GC’d a little bit sooner.  This is reason enough to use using on the BackgroundWorker.
  • The using statement for a BackgroundWorker does nothing with the BackgroundWorker’s event handlers (i.e. it doesn’t detach any event handlers)

 

Q: Should I detach event handlers in the handler for RunWorkerCompleted?

A: (short) No, you (typically) don’t need to explicitly detach event handlers for a BackgroundWorker

A: (longer)

  • In .NET, if two objects reference each other, but no other “root” object references either of them, they do both get garbage collected
  • If we have a WPF form that has a class-level reference to a BackgroundWorker
    • Assume that we instantiate the BackgroundWorker when user does something on the form and attach handlers (methods in form) to that instance
    • Form now has ref to BackgroundWorker (class-level ref) and BW has ref to form (via the handlers)
    • When form closes, if main application no longer has a reference to the form, both the form and the BackgroundWorker will be properly garbage collected even though they reference each other
  • You do need to detach handlers if you have a BackgroundWorker that is meant to live longer than the object that owns the handlers
    • E.g. if we had an application-level BackgroundWorker and forms that attached handlers to its DoWork or RunWorkerCompleted events.  If the BW was meant to live after the form closes, you’d want to have the form detach its handlers when it closed.
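
The long-lived worker case can be sketched like this. ShortLivedView is a hypothetical stand-in for the form, and I am using Dispose as the place to detach. In a real form you would more likely do this when the form closes.

```csharp
using System;
using System.ComponentModel;

public class ShortLivedView : IDisposable
{
    private readonly BackgroundWorker _sharedWorker;

    public ShortLivedView(BackgroundWorker sharedWorker)
    {
        _sharedWorker = sharedWorker;
        // From here on the worker holds a reference to this view via the delegate.
        _sharedWorker.RunWorkerCompleted += OnCompleted;
    }

    public int CompletedCount { get; private set; }

    private void OnCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        CompletedCount++;
    }

    public void Dispose()
    {
        // Detach so the longer-lived worker no longer keeps this view alive
        // and no longer calls into it.
        _sharedWorker.RunWorkerCompleted -= OnCompleted;
    }
}
```

Without the detach, the view stays reachable for as long as the worker does, and its handler keeps running for every later operation.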

 

 

Some Cute CYA Code in .NET Source

I stumbled upon the following comment in the .NET source code for the ItemsControl class.

In effect, the developer is saying–“I could do this a more elegant fashion.  I just want everyone to know that this is ugly because I was told to do it this way”.

        protected virtual void PrepareContainerForItemOverride(DependencyObject element, object item)
        {
            // Each type of "ItemContainer" element may require its own initialization.
            // We use explicit polymorphism via internal methods for this.
            //
            // Another way would be to define an interface IGeneratedItemContainer with
            // corresponding virtual "core" methods.  Base classes (ContentControl,
            // ItemsControl, ContentPresenter) would implement the interface
            // and forward the work to subclasses via the "core" methods.
            //
            // While this is better from an OO point of view, and extends to
            // 3rd-party elements used as containers, it exposes more public API.
            // Management considers this undesirable, hence the following rather
            // inelegant code.

Who can blame him? I’ve written comments like this myself. Source code as confessional.

Visual Studio 2010 Install Screenshots

Beta 1 of Visual Studio 2010 is now available on MSDN (if you have the appropriate MSDN subscription).  Here is a complete set of screenshots, outlining the installation experience.

Note: I installed VS 2010 Beta 1 on a clean virtual machine running Windows 7 Build 7100 (RC).

We start with the familiar install startup menu:

First screen

Then we get a banner page, as things start up.

Install Banner

Next, we get a license page, as well as an overview of what is going to be installed.  The key components are:

  • VC 9.0 and 10.0 runtime libraries
  • .NET Framework 4 Beta 1
  • Help 3.0 Beta 1
  • Visual Studio Macro Tools
  • Visual Studio 2010 Professional Beta 1

License Page

Next up is an options page:

Options Page

Now the actual installation begins and we can see a more complete list of all the components that will be installed.  For completeness, here’s the full list:

  • VC 9.0 Runtime
  • VC 10.0 Runtime
  • Microsoft .NET Framework 4 Beta 1
  • Microsoft Help 3.0 Beta 1
  • Microsoft Visual Studio Macro Tools
  • Microsoft Visual Studio 2010 Professional Beta 1
  • Microsoft Web Deployment Tool
  • Visual Studio Tools for the Office System 4.0 Runtime
  • Microsoft Office Development Tools for Visual Studio 2010
  • Dotfuscator Software Services – Community Edition
  • Microsoft SQL Server Compact 3.5 SP1
  • SQL Server Compact Tools for Visual Studio 2010 Beta 1
  • Microsoft Sync Framework Runtime v1.0
  • Microsoft Sync Services for ADO.NET v2.0
  • Microsoft Sync Framework Services v1.0
  • Microsoft Sync Framework SDK v1.0
  • Microsoft SQL Publishing Wizard 1.4
  • SQL Server System CLR Types
  • Shared Management Objects
  • Microsoft SQL Server 2008 Express Edition

Wow.  This is going to take a while.

Installation Begins

You’ll have to reboot after the .NET Framework 4 installation.

Reboot Required

Go get a cup of coffee while the remaining components install.

Coffee Break

You’ll get a warning dialog, indicating that SQL Server 2008 has compatibility issues on Windows 7 and suggesting that you install SP1.

Compatibility

I just clicked the Run Program button and proceeded with the install.  A little bit later, I got a second compatibility warning dialog, also mentioning SQL Server 2008.  An external DOS window was also spawned, running a setup.exe command.

Compatibility #2

Finally, everything finishes up and we’re done!

Installation Complete

After the install completes, we get the main autorun window again and the link for checking for service releases is now active.

Autorun #2

If you click the Check for Service Releases link, you’ll be redirected to an update web page, which in turn allows firing up the Windows Update applet.  When I tried this (29 Jun 2009), no updates were found.

Finally, we bring up Visual Studio 2010 for the first time.

Splash Screen

As with earlier versions, when you start Visual Studio for the first time, you’re asked to choose a language, which dictates how the environment is set up.  I’m a C# guy.

When things finally start up, we see the new Start Page for the first time.

Start Page

The New Project dialog also gets a fresh look.

New Project

Finally, we create an empty WPF Application.

WPF Application

.NET Basics – Do Work in Background Thread to Keep GUI Responsive

One of the most important things that differentiates a “quick and dirty” application from one that has been designed well is how the application’s user interface behaves during lengthy operations.  The quick-and-dirty approach is to just do all of your work in a button’s Click event handler and not worry about the user interface.  The problem with this is that the GUI will freeze up while the application does whatever work it needs to do.

A well designed application, on the other hand, is one that is careful to do as much work as possible in background threads, keeping the GUI responsive and making sure that it makes it obvious to the user that work is going on in the background and adjusts the GUI to disallow any user actions that don’t apply until after the work finishes.

Under .NET 2.0, doing work on a background thread has become a lot easier, with the introduction of the BackgroundWorker class.  You no longer have to worry about cross-threading exceptions and checking a control’s InvokeRequired property.

A Simple Example of Using the BackgroundWorker Class

In this post, I’ll create a simple example of how you might use the BackgroundWorker class to do some work on a background thread and keep your GUI responsive.  We’ll start with a simple example that demonstrates how the GUI can become blocked and then evolve the application to make full use of the capabilities of the BackgroundWorker class.

Here are the basic players. We’ll have a FileReader class/object that reads text from a text file, and a Windows Forms form with a button to initiate the file read operation and some GUI elements to display the status/results of the read operation.

Note: All code samples presented here can be found in CodePlex, at threadsafepubsub.codeplex.com

Iteration #1 – The Simplest Possible Solution

Let’s say that we just want to read a text file and return/display the number of lines found in the file. We can just make a call to our FileReader object, which returns the number of lines, and then display that number in our UI. Super simple.

This iteration is implemented in the files Form1.cs and FileReader1.cs.

Here’s what the GUI looks like.  If you click on the Read File button, you get a File Open dialog where you can select a file to read.  The file is read in and then we write out the # lines read, below the button.

001-Iter1Client

So far, so good.  This is how most simple user interfaces are written–you click on a button, which launches a Click callback, which does some work, and then returns to the caller.

Here’s what the FileReader1 class looks like, with a simple ReadTheFile method:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

namespace ThreadSafePubSubUI
{
    public class FileReader1
    {
        // Read specified text file & return # lines
        public int ReadTheFile(string fileName)
        {
            int numLines = 0;

            using (StreamReader sr = new StreamReader(fileName))
            {
                string nextLine;
                while ((nextLine = sr.ReadLine()) != null)
                {
                    numLines++;
                }
            }

            return numLines;
        }
    }
}

And here’s the click event handler for the form: the guy that invokes ReadTheFile.

        private void btnSelect_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.CheckFileExists = true;
            ofd.CheckPathExists = true;

            if (ofd.ShowDialog() == DialogResult.OK)
            {
                FileReader1 fr = new FileReader1();
                int numLines = fr.ReadTheFile(ofd.FileName);

                lblResults.Text = string.Format("We read {0} lines", numLines.ToString());
            }
        }

But what if the function that does the work takes a longer amount of time?  It’s pretty common for some action initiated by the user to take a little time.  What happens to the GUI while they are waiting?  We can simulate this by just adding a Thread.Sleep call in the ReadTheFile method.

            Thread.Sleep(3000);     // Simulate lengthy operation (requires using System.Threading;)

Let’s also add a line in the btnSelect_Click method, to write a “busy” message to the GUI while we are processing.  Here is the updated click event handler:

        private void btnSelect_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.CheckFileExists = true;
            ofd.CheckPathExists = true;

            if (ofd.ShowDialog() == DialogResult.OK)
            {
                lblResults.Text = " ... reading the file ...";
                FileReader1 fr = new FileReader1();
                int numLines = fr.ReadTheFile(ofd.FileName);

                lblResults.Text = string.Format("We read {0} lines", numLines.ToString());
            }
        }

What happens is not good.  Two bad things happen, from a user’s point of view:

  • The user interface is completely unresponsive during the file read operation
  • Our “reading the file” message is not displayed

What happened?  Well, because everything is on one thread, our user interface thread doesn’t respond to mouse clicks until ReadTheFile finishes.  Worse, even though we set the label’s Text property before we call ReadTheFile, the message loop doesn’t get a chance to process that change, and update the text, before we go out to lunch in ReadTheFile.

What we need to do to fix this is: do the file read operation on a different thread.

The easiest way to do some work on a background thread, keeping the GUI responsive, is to use the BackgroundWorker class.

Iteration #2 – Using the BackgroundWorker Class

You should be doing very little actual work in GUI control event handlers like the Button.Click method.  It’s a good idea to:

  • Move code that does actual work outside of the user interface class
  • Do all work on a background thread.

We want to move code into a separate library or class, rather than having it in our Click event handler, to keep our user interface code separate from our functional code.  This is just a cleaner architecture and makes our code more maintainable, easier to test, and more extensible.

We also want to do as much work as possible on a different thread from the main thread handling the GUI.  If you do your work on the same thread, you risk locking up the user interface.  (As we saw in Iteration #1).

If you’re using the .NET Framework version 2.0 or later, the best way to do work on a background thread is to use the BackgroundWorker class.  This class gives us the ability to do some work on a background thread, provides progress and completed events “out of the box” and also ensures that these callbacks execute on the correct (GUI) thread.

What do I mean by “execute on the correct thread”?  Here’s how it works.  To ensure that the GUI stays responsive, we want to do any non-trivial work on a background thread.  This thread can run in parallel to the GUI thread, so the user will still be able to interact with the GUI while the work is being done.

When the work finishes, we likely want to update something in the GUI to indicate this.  (E.g. change the text on a label to indicate that the operation is done).  Our GUI object will be notified by handling an event that the worker object fires.  But since we need to update the GUI, this event handler must be executing on the same thread as the user interface.

This last point is very important.  The core rule in Windows UI programming to remember is: the only thread that can update/change a user interface control is the thread that created it.  (This is true for Windows Forms applications, which use the Single Threaded Apartment model).

The beauty of the BackgroundWorker class is that it automatically handles all of this thread logic:

  • It does work on a background thread
  • It ensures that completed/progress events are fired on the original GUI thread

Let’s change our earlier file-reading example to use the BackgroundWorker.  This example can be found in the threadsafepubsub.codeplex.com project in the Form2/FileWorker2 classes.

Here’s the new Click event handler, where we create the background worker, attach our event handlers, and then tell it to go do some work.

        private void btnSelect_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.CheckFileExists = true;
            ofd.CheckPathExists = true;

            if (ofd.ShowDialog() == DialogResult.OK)
            {
                lblResults.Text = " ... reading the file ...";

                // Set up background worker object & hook up handlers
                BackgroundWorker bgWorker;
                bgWorker = new BackgroundWorker();
                bgWorker.DoWork += new DoWorkEventHandler(bgWorker_DoWork);
                bgWorker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(bgWorker_RunWorkerCompleted);

                // Launch background thread to do the work of reading the file.  This will
                // trigger BackgroundWorker.DoWork().  Note that we pass the filename to
                // process as a parameter.
                bgWorker.RunWorkerAsync(ofd.FileName);
            }
        }

We first create an instance of the BackgroundWorker class and then attach handlers to its DoWork and RunWorkerCompleted events.  DoWork is the event that fires when we call the RunWorkerAsync method.  Its handler runs asynchronously, on a background thread, freeing up the user interface.  Because RunWorkerAsync only starts the background work and then returns, control returns from the btnSelect_Click method quickly, and the UI stays responsive, even while the file-read work is going on.

We also hook a handler to the RunWorkerCompleted event, which will fire when our bgWorker_DoWork method has finished doing the work.  This event, however, will execute on the original GUI thread–allowing us to update GUI elements directly within our bgWorker_RunWorkerCompleted handler.

Here’s the body of our DoWork handler.

        void bgWorker_DoWork(object sender, DoWorkEventArgs e)
        {
            FileReader2 fr = new FileReader2();

            // Filename to process was passed to RunWorkerAsync(), so it's available
            // here in DoWorkEventArgs object.
            string sFileToRead = (string)e.Argument;
            e.Result = fr.ReadTheFile(sFileToRead);
        }

Notice that we just use our earlier FileReader class to do the actual work of reading the file.  But there are two additions.

First, because this method is invoked from the BackgroundWorker object, we need to somehow get the name of the file to process.  We knew this filename back in the btnSelect_Click method and we hand it off by passing it as a parameter to RunWorkerAsync and then reading it out of the DoWorkEventArgs parameter.

Similarly, when we finish doing our work (reading the file), we need to make sure the result (# lines read) gets passed back to our RunWorkerCompleted handler.  We do this by setting the Result property of the DoWorkEventArgs parameter.

Here’s the code for our RunWorkerCompleted event handler:

        void bgWorker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
        {
            if (e.Error != null)
            {
                MessageBox.Show(e.Error.Message);
            }
            else
            {
                int numLines = (int)e.Result;
                lblResults.Text = string.Format("We read {0} lines", numLines.ToString());
            }
        }

Here we see the other side of the e.Result handoff–we read the FileReader.ReadTheFile return value out of the RunWorkerCompletedEventArgs parameter.  We also check this parameter to see if an error occurred.

If you now run this example, you’ll see a couple of important things that work better than they did in iteration #1:

  • We now correctly see the “reading the file” label, indicating that work is in progress
  • While the file is being read, we can interact with the GUI normally

You can demonstrate the second part of this by clicking on the “Tell Me a Joke” button.  You’ll get a message box with a clever joke and you can then dismiss the dialog–all while the file read operation is still going on.

Iteration #3 – Application State and Cancel Logic

You might be tempted at this point to think that we’re done and our application has everything that it needs.  But we’re missing a few critical things.  Any time that you do work in a background worker thread, you should also consider:

  • Busy indicator — making it easy for the user to know when work is being done in the background
  • Application state — what can/can’t the user do while the work is in progress?
  • Progress indicator — give the user a visual sense of how much work is left to be done
  • Cancel logic — optionally, give the user a method to cancel the background work

Busy Indicator

Let’s start with the busy indicator.  It’s important to make it obvious to your users that something is happening in the background, and what that something is.

Application State

We have some subtle behavior in our current implementation that is probably not desirable.  Try the following:

  • Click on the Read File button and select a file, to initiate a file read operation
  • Before the read has completed, click on the button again and select a new file

You now have two file read operations running in parallel.  Is this really what we want?  Do we want to prohibit it?  If not, how do we handle the results of two different file read operations, when the operations complete?  How do we avoid mixing up the results?  How do we know which operation the results are coming from?  Is there a chance that the two operations will attempt to work on/with the same data?

For our purposes, let’s agree that we really only want to allow the user to do one operation at a time.  While one operation is in progress, a user cannot initiate another one.  We’ll modify the GUI to enforce this.

Progress Indicator

More than just indicating that some work is going on in the background, it would be nice to indicate how much work we’ve already done and how much work is left to do.  This lets a user judge how long the entire process will take.

Cancel Logic

Whenever you support doing some work on a background thread, you also need to consider whether a user might want to cancel this background activity.  Unless it’s something that happens quite quickly, it’s probably a good idea to allow a user to cancel the operation and return to the original state (no file is being read and they are able to select a new file to be read).
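
Before getting to the actual example, here is a rough sketch of how the progress and cancel pieces map onto BackgroundWorker members. This is not the Form3/FileReader3 code. The class name, the step count, and the Thread.Sleep stand-in for real work are all assumptions made for illustration.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class CancellableWorkerSketch
{
    // Builds a worker that "processes" totalSteps steps, reporting progress after each.
    public static BackgroundWorker Create(int totalSteps)
    {
        var worker = new BackgroundWorker
        {
            WorkerReportsProgress = true,        // required before calling ReportProgress
            WorkerSupportsCancellation = true    // required before calling CancelAsync
        };

        worker.DoWork += (sender, e) =>
        {
            var self = (BackgroundWorker)sender;
            for (int step = 1; step <= totalSteps; step++)
            {
                // CancelAsync only sets a flag; the work loop has to check it.
                if (self.CancellationPending)
                {
                    e.Cancel = true;   // surfaces as e.Cancelled in RunWorkerCompleted
                    return;
                }
                Thread.Sleep(10);      // stand-in for reading a chunk of the file
                self.ReportProgress(step * 100 / totalSteps);
            }
            e.Result = totalSteps;
        };

        return worker;
    }
}
```

In a Windows Forms app, a ProgressChanged handler would typically copy e.ProgressPercentage into a progress bar, and a Cancel button would call CancelAsync on the worker. The RunWorkerCompleted handler should check e.Cancelled before reading e.Result, because reading Result after a cancellation throws an exception. Outside of a GUI app (e.g. a console app with no synchronization context), these events are raised on thread pool threads rather than on the original thread.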

At this point, it’s probably a good idea to do a rough sketch of a state diagram, showing what a user can do and during what state:

Application State Diagram

Notice that we enter the “reading file” state when the user clicks the “Read File” button.  But while in this state, the user cannot press that button again–they either press the “Cancel” button, or we return to the original state when the file read operation completes.

Also note that we should be able to display a joke while in either state.  This confirms what we said earlier–the GUI won’t lock up during the file read operation.

Our Modified Example

Here’s how our file reader example works, after adding a progress indicator, cancel logic, and the ability to keep track of application state.  Here’s the new GUI during a file read operation:

Progress

Note that we now tell the user what file we’re reading and we display a progress indicator, showing how far into the read operation we are.  We also give them a Cancel button, allowing them to Cancel the operation before it completes normally.  Also notice that the Read File button is greyed out—the user can’t initiate another operation until the first one completes.

If the user lets the file read operation complete normally, they’ll see the following:

Success

Notice that when we finish reading the file, returning to the Idle state, we hide all of the progress/cancel widgets.  The Read File button is also enabled again.

If the user cancels the file read operation, they’ll see the following:

Cancelled

Again, all of the progress/cancel widgets are gone, since we’re back in the Idle state.  And the Read File button is available again.  But this time, we tell the user that they cancelled the operation.

The code for this iteration can be found in threadsafepubsub.codeplex.com, as Form3.cs and FileReader3.cs.

We added a couple of things at the top of the class–an enumeration to keep track of our state, and a class-level BackgroundWorker instance.  (We move this variable into class scope because our Cancel button will need access to the BackgroundWorker object.)

    private enum AppStates { Idle, ReadingFile };

    private BackgroundWorker _worker;

Here’s our new Form3 constructor, where we now call a method to set the initial application state:

        public Form3()
        {
            InitializeComponent();

            // Set up initial state
            SetAppState(AppStates.Idle, null);
        }

Here’s the actual code for the new SetAppState function, as well as a helper function that sets visibility for several controls.

        // Set new application state, handling button sensitivity, labels, etc.
        private void SetAppState(AppStates newState, string filename)
        {
            switch (newState)
            {
                case AppStates.Idle:
                    // Hide progress widgets
                    SetFileReadWidgetsVisible(false);
                    btnSelect.Enabled = true;
                    break;

                case AppStates.ReadingFile:
                    // Display progress widgets & file info
                    SetFileReadWidgetsVisible(true);
                    lblProgress.Text = string.Format("Reading file: {0}", filename);
                    pbProgress.Value = 0;
                    lblResults.Text = "";
                    btnSelect.Enabled = false;
                    break;
            }
        }

        private void SetFileReadWidgetsVisible(bool visible)
        {
            lblProgress.Visible = visible;
            pbProgress.Visible = visible;
            btnCancel.Visible = visible;
        }

We’re basically just changing the visibility of the various progress widgets in the StatusStrip at the bottom of the form.  We also handle enabling/disabling the Read File button here.

The Click event handler for our Read File button is also slightly different. We add a line that sets the application state to indicate that a file is being read, we attach a handler to track progress, and we add an exception handler to ensure that the application state is set back to idle if anything goes wrong.

        private void btnSelect_Click(object sender, EventArgs e)
        {
            OpenFileDialog ofd = new OpenFileDialog();
            ofd.CheckFileExists = true;
            ofd.CheckPathExists = true;

            if (ofd.ShowDialog() == DialogResult.OK)
            {
                FileInfo fi = new FileInfo(ofd.FileName);
                SetAppState(AppStates.ReadingFile, fi.Name);

                try
                {
                    // Set up background worker object & hook up handlers
                    _worker = new BackgroundWorker();
                    _worker.DoWork += new DoWorkEventHandler(bgWorker_DoWork);
                    _worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(bgWorker_RunWorkerCompleted);
                    _worker.WorkerReportsProgress = true;
                    _worker.WorkerSupportsCancellation = true;
                    _worker.ProgressChanged += new ProgressChangedEventHandler(bgWorker_ProgressChanged);

                    // Launch background thread to do the work of reading the file.  This will
                    // trigger BackgroundWorker.DoWork().  Note that we pass the filename to
                    // process as a parameter.
                    _worker.RunWorkerAsync(ofd.FileName);
                }
                catch
                {
                    SetAppState(AppStates.Idle, null);
                    throw;
                }
            }
        }

Note also that we have to explicitly tell the BackgroundWorker that it should support both progress and cancellation functionality.

We also now have a new event handler for the ProgressChanged event, which looks like this:

        // Get info on progress of file-read operation (% complete)
        void bgWorker_ProgressChanged(object sender, ProgressChangedEventArgs e)
        {
            // Just update progress bar with % complete
            pbProgress.Value = e.ProgressPercentage;
        }

This one is pretty simple—we just set the value of the progress bar, which runs from 0 to 100, to the reported % complete value.

Our DoWork handler has just a few changes.  Here is the new version:

        // Do work--runs on a background thread
        void bgWorker_DoWork(object sender, DoWorkEventArgs e)
        {
            // Note about exceptions:  If an exception originates anywhere in
            // this method, or methods that it calls, the BackgroundWorker will
            // automatically populate the Error property of the RunWorkerCompletedEventArgs
            // parameter that gets passed into the RunWorkerCompleted event handler.
            // So we can handle the exception in that method.

            FileReader3 fr = new FileReader3();

            // Filename to process was passed to RunWorkerAsync(), so it's available
            // here in DoWorkEventArgs object.
            BackgroundWorker bw = sender as BackgroundWorker;
            string sFileToRead = (string)e.Argument;

            e.Result = fr.ReadTheFile(bw, sFileToRead);

            // If operation was cancelled (triggered by CancellationPending),
            // we bailed out of ReadTheFile() early.  But still need to set
            // Cancel flag, because RunWorkerCompleted event will still fire.
            if (bw.CancellationPending)
                e.Cancel = true;
        }

I added a note to remind us that exceptions originating in this chunk of code (or on this thread) are automatically made available to us in the RunWorkerCompleted handler.

Notice also that we’re now passing the BackgroundWorker object into the ReadTheFile method.  We do this because we need access to it, within that method, to check for user cancellation and to report progress.

Finally, we see a piece of the cancellation infrastructure here.  Below is another code chunk to help us understand how the cancel operation works—the click handler for the Cancel button.

        private void btnCancel_Click(object sender, EventArgs e)
        {
            _worker.CancelAsync();
        }

This is pretty simple.  When the user clicks the Cancel button, we tell the BackgroundWorker object to initiate a cancel operation.  Here’s a summary of the entire cancel operation (what happens when):

  • User clicks Cancel button
  • btnCancel_Click handler invokes BackgroundWorker.CancelAsync on active worker object
  • Method doing actual work (reading file) periodically checks BackgroundWorker.CancellationPending and aborts if it sees this property set
  • Control returns to bgWorker_DoWork method
  • DoWork method checks CancellationPending property and sets DoWorkEventArgs.Cancel to true if operation was cancelled
  • BackgroundWorker.RunWorkerCompleted fires
  • We can check RunWorkerCompletedEventArgs.Cancelled, in our RunWorkerCompleted handler, to detect whether operation was cancelled

This is a little involved, but if you walk through the code, you’ll see how things work.
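To see that sequence in isolation, here is a minimal sketch that runs the same steps without a GUI.  The names CancelFlowSketch and RunAndCancel are made up for illustration, and a polling loop stands in for the file read; treat it as a way to experiment with the flow rather than as the code from this project.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class CancelFlowSketch
{
    // Starts a worker, requests cancellation, and returns the value of
    // RunWorkerCompletedEventArgs.Cancelled seen by the completion handler.
    public static bool RunAndCancel()
    {
        bool sawCancelled = false;

        using (ManualResetEvent started = new ManualResetEvent(false))
        using (ManualResetEvent finished = new ManualResetEvent(false))
        using (BackgroundWorker worker = new BackgroundWorker())
        {
            worker.WorkerSupportsCancellation = true;

            worker.DoWork += (sender, e) =>
            {
                BackgroundWorker bw = (BackgroundWorker)sender;
                started.Set();

                // Stand-in for the file-read loop: keep "working" until
                // cancellation has been requested.
                while (!bw.CancellationPending)
                    Thread.Sleep(10);

                // Back in DoWork: flag the run as cancelled so that the
                // completion handler can tell.
                e.Cancel = true;
            };

            worker.RunWorkerCompleted += (sender, e) =>
            {
                // With no UI synchronization context (e.g. a console app),
                // this handler runs on a thread-pool thread.
                sawCancelled = e.Cancelled;
                finished.Set();
            };

            worker.RunWorkerAsync();
            started.WaitOne();      // make sure DoWork is actually running
            worker.CancelAsync();   // what btnCancel_Click does
            finished.WaitOne();     // wait for RunWorkerCompleted
        }

        return sawCancelled;
    }
}
```

Calling CancelFlowSketch.RunAndCancel() from a small console program should return true.  If you comment out the e.Cancel = true line, the completion handler sees Cancelled as false, which is why the DoWork handler has to set that flag itself.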

Finally, here is our RunWorkerCompleted event handler:

        void bgWorker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
        {
            try
            {
                if (e.Error != null)
                {
                    MessageBox.Show(e.Error.Message, "Error During File Read");
                }
                else if (e.Cancelled)
                {
                    lblResults.Text = "** Cancelled **";
                }
                else
                {
                    int numLines = (int)e.Result;
                    lblResults.Text = string.Format("We read {0} lines", numLines.ToString());
                }
            }
            finally
            {
                // State now goes back to idle
                SetAppState(AppStates.Idle, null);
            }
        }

There are just a couple of new things here.  We now check the Cancelled property and display a message if the operation was cancelled.  We also add a finally block, where we ensure that we transition back to the Idle state, whether things completed normally, the user cancelled, or there was an error.

I have one final block of code to share—the ReadTheFile method that does the actual work:

        public int ReadTheFile(BackgroundWorker bw, string fileName)
        {
            int numLines = 0;
            FileInfo fi = new FileInfo(fileName);
            long totalBytes = fi.Length;
            long bytesRead = 0;

            using (StreamReader sr = new StreamReader(fileName))
            {
                // Note: When BackgroundWorker has CancellationPending set, we bail
                // out and fall back to the _DoWork method that called us.
                string nextLine;
                while (((nextLine = sr.ReadLine()) != null) &&
                       !bw.CancellationPending)
                {
                    bytesRead += sr.CurrentEncoding.GetByteCount(nextLine);
                    numLines++;
                    int pctComplete = (int)(((double)bytesRead / (double)totalBytes)* 100);
                    bw.ReportProgress(pctComplete);
                    Thread.Sleep(10);  // ms
                }
            }

            return numLines;
        }

We’ve basically added two things here: support for cancellation and for progress reporting.

To support user-initiated cancellation, we just check to see if the operation has been cancelled, after each line in the file that we read.  The frequency with which you check for cancellation is important.  If you don’t check often enough, the application will appear to not be responding to the cancel request and the user may become frustrated.

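We report progress (% complete) by invoking the ReportProgress method on the background worker.  We do this after calculating the actual progress, in terms of # bytes read so far.  Note that this figure is an estimate: ReadLine strips the line terminators, so the byte count runs slightly behind the file’s true length and the bar may finish a little short of 100%.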
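If you want to tidy up the progress arithmetic, one option is to pull it into a small helper.  The sketch below is not part of the original project; ProgressMath and ProgressThrottle are hypothetical names.  It clamps the percentage to the 0–100 range that the progress bar expects, and it lets the caller skip ReportProgress calls when the percentage hasn’t changed since the last report.

```csharp
using System;

public static class ProgressMath
{
    // Convert a byte count into a percentage, clamped to 0..100.
    // An empty file (totalBytes <= 0) is treated as already complete.
    public static int PercentComplete(long bytesRead, long totalBytes)
    {
        if (totalBytes <= 0)
            return 100;

        int pct = (int)((double)bytesRead / (double)totalBytes * 100);
        return Math.Max(0, Math.Min(100, pct));
    }
}

public sealed class ProgressThrottle
{
    private int _lastReported = -1;

    // Returns true only when the percentage differs from the last one
    // reported, so the caller can avoid redundant ReportProgress calls.
    public bool ShouldReport(int pct)
    {
        if (pct == _lastReported)
            return false;

        _lastReported = pct;
        return true;
    }
}
```

Inside the read loop, this would look something like: compute pct with ProgressMath.PercentComplete(bytesRead, totalBytes), then call bw.ReportProgress(pct) only if throttle.ShouldReport(pct) returns true.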

After Signing My Assembly, Why Do I Get Errors About Signing Referenced Assemblies?

This is a note-to-self quickie blog post.

I’m in the process of deploying a VSTO solution that includes two DLLs–a data access layer DLL and an Excel Workbook (VSTO) project that contains the code-behind for the actual Excel workbook.  In my case, my Excel code creates controls that live on the Action Pane in Excel and allow the user to interact with pre-created graphs that are fed data from a database.

When you deploy a VSTO solution, you need to grant full trust to the class library associated with the Excel Workbook (or other Office document).  This in turn means that you need to sign your assembly, i.e. attach a strong name to it.

When I sign my main assembly (e.g. ExcelWorkbook1.dll), and try to build it, I now get the following error:

Error    1    Assembly generation failed — Referenced assembly ‘MyDataAccessLayer’ does not have a strong name

What’s going on here is that when you sign an assembly, all referenced assemblies must now also have strong names.  Let’s say that again, the rule to remember is:

All assemblies referenced by a strong-named assembly must also have a strong name.

This makes sense, when you think about security concerns.  The purpose of signing an assembly is to prevent someone from replacing your assembly with one that has the same API, but does something bad–i.e. spoofing it.  Signing your main assembly helps, but if it then references a weakly-named (not signed) assembly, someone could spoof that assembly and still make your assembly behave badly.  That’s a security hole.

So when you think about signing your assemblies, giving them strong names, remember that it’s a domino effect–you’ll need to (and want to) sign all of your assemblies.  And any third-party assemblies that you use/reference need to also have strong names.
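If you’d like to find the offending references before the build complains, a quick reflection check can help.  This is a rough sketch, with StrongNameCheck as a made-up helper name.  It only looks at whether each referenced assembly name carries a public key token; it does not verify any signature.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class StrongNameCheck
{
    // An assembly name with a non-empty public key token is strong-named.
    public static bool HasStrongName(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        return token != null && token.Length > 0;
    }

    // List the names of referenced assemblies that lack a strong name.
    public static List<string> UnsignedReferences(Assembly assembly)
    {
        List<string> unsigned = new List<string>();

        foreach (AssemblyName reference in assembly.GetReferencedAssemblies())
        {
            if (!HasStrongName(reference))
                unsigned.Add(reference.Name);
        }

        return unsigned;
    }
}
```

For example, StrongNameCheck.UnsignedReferences(Assembly.GetExecutingAssembly()) returns the names of any references that would trigger the error above once the main assembly is signed.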

A Simple .NET Twitter API Wrapper Using LINQ

In the world of software demos, doing something with Twitter has replaced Hello World as the most common target of a demo.  At the risk of polluting the world with yet another chunk of code that does something with Twitter–I’d like to play around a bit with Silverlight charting and Twitter seems a great context for demoing what is possible.

But before I can start creating a Silverlight demo, I need a basic Twitter API wrapper in .NET.  So here’s a starting point–a simple example that uses LINQ to get a list of people that you follow.  This is a good starting point for later demos.

Twitter provides a simple REST API that lets you do basically everything you’d want to do using simple HTTP GET, POST and DELETE requests.

You can learn everything you need to know about the Twitter API at the Twitter API Wiki.

Basic Concepts

I’ll assume that you generally know how Twitter works–you follow some folks, some folks follow you, and you all post status messages–which your followers can read.  That’s the beauty of Twitter–pretty simple.

But here are some things that you should know about the Twitter API.

  • How it works
    • You post an HTTP request to a URL
    • You get XML data back in the HTTP response
  • Authentication
    • Some API method calls require authentication, using HTTP Basic Authentication.
    • Any app invoking calls that require authentication will need to supply the proper credentials.
  • Rate limits
    • Your app is limited to 100 requests per hour (whether you’re authenticating or not).
    • You can receive a special dispensation from the Twitter gurus to be allowed up to 20,000 requests/hr.
  • Paging
    • Many API methods require multiple requests, retrieving a page of data at a time
    • The page parameter allows you to specify which page of data to retrieve
    • The count parameter allows specifying # items per page

What happens when you hit your rate limit?  Well, basically your application (your IP address, actually) can no longer make requests of Twitter–until the rate limit resets.

The API Calls That I Use

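Here are the two Twitter API calls (URLs) that I use in this example:

  • http://twitter.com/statuses/friends/screenname.xml (the friends call)
  • http://twitter.com/users/show.xml?screen_name=screenname (the users call)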

You can see how these work by just entering the above URLs into your browser and looking at the XML data stream that comes back.

Here’s an example of the data returned by the friends call:

Output of Friends Request

And here’s an example of the data returned from the users call:

Data Returned from Users Request

The Peep Class

Let’s start building a simple Twitter API wrapper in .NET with a very simple class to encapsulate information about a single user–either yourself, a follower, or someone that you follow.  This doesn’t cover absolutely everything that we can find out about a Twitter user, but encapsulates some of the basic stuff that we care about.

(For these examples, I’m using Visual Studio 2008 — C# 3.0).

Here’s the code for the Peep class:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace TwitterLibGUI
{
    public class Peep
    {
        public string ScreenName { get; set; }
        public string Name { get; set; }
        public string Location { get; set; }
        public string Description { get; set; }
        public Uri ProfileImageURL { get; set; }
        public Uri URL { get; set; }
        public int ID { get; set; }

        public int NumFollowers { get; set; }
        public int NumFollowing { get; set; }
        public int NumUpdates { get; set; }

        public string LastUpdateText { get; set; }
        public DateTime LastUpdateDateTime { get; set; }

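        // Override (rather than hide) Object.ToString, so that callers holding
        // a plain object reference, such as list controls, get this text too.
        public override string ToString()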
        {
            return string.Format("{0} - {1}", ScreenName, Name);
        }
    }
}

Super simple class, made much easier through the use of C#’s automatic properties.  As an example, using me as a Twitter user, my ScreenName would be “spsexton” and my Name would be “Sean Sexton”.

The Peeps Class

Now that we have an object that wraps a “peep”, let’s create a special class that represents a collection of Peep instances–or “peeps”.  For example, an instance of Peeps could be used to store a list of everyone that we follow (or everyone that follows us).

We’ll use our old friend, the List<T> class, from System.Collections.Generic, which implements a strongly typed collection.

Basically, a collection of Peep objects will look like this:  List<Peep>.  But we’ll create a subclass so that we can add a static method for building up a list of everyone that we follow.

Here’s the full code for Peeps.cs, followed by an explanation of how we do things.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Net;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace TwitterLibGUI
{
    /// <summary>
    /// Peep collection class
    /// </summary>
    public class Peeps : List<Peep>
    {
        // Partial Twitter API
        private const string getFriendsURI = "http://twitter.com/statuses/friends/{0}.xml?page={1}";

        /// <summary>
        /// Return list of Peeps followed by a specified person
        /// </summary>
        /// <param name="ScreenName">The Twitter username, e.g. spsexton</param>
        /// <returns></returns>
        public static Peeps PeopleFollowedBy(string ScreenName, out int RateLimit, out int LimitRemaining)
        {
            if ((ScreenName == null) || (ScreenName == ""))
                throw new ArgumentException("PeopleFollowedBy: Invalid ScreenName");

            int nPageNum = 1;
            int userCount = 0;      // # users read on last call

            int rateLimit = 0;          // Max # API calls per hour
            int limitRemaining = 0;     // # API calls remaining

            XDocument docFriends;
            Peeps peeps = new Peeps();

            // Retrieve people I'm following, 100 people at a time
            // (each call to Twitter API results in one "page" of results--up to 100 users)
            try
            {
                do
                {
                    // Example of constituting XDocument directly from the URI
                    // docFriends = XDocument.Load(string.Format(getFriendsURI, ScreenName, nPageNum));

                    // Manually create an HTTP request, so that we can pull information out of the
                    // headers in the response.  (Then later constitute the XDocument).
                    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(string.Format(getFriendsURI, ScreenName, nPageNum));
                    HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
                    TwitterUtility.GetInfoFromResponse(resp, out rateLimit, out limitRemaining);
                    XmlReader reader = XmlReader.Create(resp.GetResponseStream());
                    docFriends = XDocument.Load(reader);

                    IEnumerable<XElement> users = docFriends.Elements("users").Elements("user");

                    userCount = users.Count();
                    if (userCount > 0)
                    {
                        List<Peep> nextPage = (from user in users
                                               orderby (string)user.Element("screen_name")
                                               select new Peep
                                               {
                                                   ID = (int)user.Element("id"),
                                                   ScreenName = (string)user.Element("screen_name"),
                                                   Name = (string)user.Element("name"),
                                                   Location = (string)user.Element("location"),
                                                   Description = (string)user.Element("description"),
                                                   ProfileImageURL = TwitterUtility.UriFromString((string)user.Element("profile_image_url")),
                                                   URL = TwitterUtility.UriFromString((string)user.Element("url")),
                                                   NumFollowers = (int)user.Element("followers_count"),
                                                   LastUpdateDateTime = TwitterUtility.SafeUpdateDateTime(user.Element("status")),
                                                   LastUpdateText = TwitterUtility.SafeUpdateText(user.Element("status"))
                                               }).ToList();

                        peeps.AddRange(nextPage);
                    }
                    nPageNum++;
                } while (userCount > 0);
            }
            catch (WebException xcp)
            {
                throw new ApplicationException(
                    string.Format("Twitter rate limit exceeded, max of {0}/hr allowed. Remaining = {1}",
                        rateLimit,
                        limitRemaining),
                    xcp);
            }
            finally
            {
                RateLimit = rateLimit;
                LimitRemaining = limitRemaining;
            }

            return peeps;
        }
    }
}

Notice that all we have in the Peeps class at this point is a static method, PeopleFollowedBy, that returns a collection of Peep objects, one for each person that the specified screen name follows.

The first thing that you’ll notice about the code is a loop where we get consecutive pages from the Twitter /statuses/friends/screenname.xml page.  By default, you get only 100 users at a time when invoking this URL.  So the easiest way to get all people that someone follows (their “friends”), is to request consecutive pages until you get one back that contains no users.

At each step through the loop, we construct a List<Peep> object from the XML results and then add that collection to a master collection (which this function will return).

Before we look at the code at the top of the loop that constructs an HTTP request to get the next page, take a look at the commented out line at the top of the loop:

                    // Example of constituting XDocument directly from the URI
                    // docFriends = XDocument.Load(string.Format(getFriendsURI, ScreenName, nPageNum));

This is actually the simplest way to get the results of the Twitter API call into an XDocument, and ready for querying.  Using this single line, you could replace the next five lines of code that end with another XDocument.Load.  What’s going on here is the core of what we want to do–load up an XDocument from the URI that represents the Twitter API call.

But in my code, I go to a little more effort to create an HttpWebRequest and then get the HttpWebResponse for that request.  I do this solely for the purpose of getting Twitter rate limit information out of the header of the response.  If you’ll recall, Twitter has a 100 calls per hour rate limit by default.  The nice thing is that the API tells us the rate limit, as well as the # calls remaining, after each request.  So we read that from the header and keep track of it.

For now, this rate limit information is just returned to the caller.  But my intent is to use it in a future example to actually slow down my Twitter calls, as needed.  This will be helpful when we want to batch up a large # of Twitter calls, but we don’t want to risk maxing out our rate limit.  More on this later.
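As a rough idea of what that pacing could look like, here is a small sketch.  RatePacing and both of its methods are hypothetical names, and the untilReset value is an assumption: nothing in the code above tells us when the rate limit window resets, so the caller would have to supply or estimate it.

```csharp
using System;

public static class RatePacing
{
    // Spacing that spreads callsPerHour requests evenly over one hour.
    public static TimeSpan MinimumInterval(int callsPerHour)
    {
        if (callsPerHour <= 0)
            throw new ArgumentOutOfRangeException("callsPerHour");

        return TimeSpan.FromSeconds(3600.0 / callsPerHour);
    }

    // Spread the remaining calls over the time left in the current window.
    // untilReset is assumed to be known (or estimated) by the caller.
    public static TimeSpan DelayBeforeNextCall(int limitRemaining, TimeSpan untilReset)
    {
        if (limitRemaining <= 0)
            return untilReset;   // nothing left: wait for the window to reset

        return TimeSpan.FromTicks(untilReset.Ticks / limitRemaining);
    }
}
```

With the default limit of 100 calls per hour, MinimumInterval(100) works out to 36 seconds between calls.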

I get the rate limit info from the header in the TwitterUtility.GetInfoFromResponse method, described below.

Here’s Where LINQ Comes In

Now for the LINQ part.  Once we load up the XDocument from Twitter’s response, we can build up a collection of XElement objects corresponding to the list of users in the XML stream.  But this isn’t quite what we want.  To get the data from the XElement objects into our List<Peep> collection, we need to do a simple LINQ query.

(Thanks to a post by Wally McClure for the basic idea for the LINQ query: Calling the Twitter API in C#).

The LINQ query is pretty simple–we grab each user element from the XML stream and create a Peep object for that user.  We initialize all the fields of the new Peep object for which we can get data from this XML stream.  (We can’t get NumFollowing or NumUpdates–we’ll have to make a different API call to get that information).

In most cases, we’re asking for the value of an XML element that is a child of the <user> element.  E.g. the <id> element.  And we call helper methods in some cases, since the elements that we’re trying to read might be null.  (Actually, I haven’t tested this thoroughly–some of the other elements might occasionally be null and so it wouldn’t be a bad idea to use a “safe” accessor method on all of the elements).

Finally, we need to convert the result of our query–which is IEnumerable<Peep>–to a List<Peep> by calling the ToList() method.  Then we add this new list to the master list that we are building.
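To try out this kind of query without calling Twitter at all, you can run it against a hard-coded XML string.  The sketch below does that.  The FriendSummary class, the FriendsXmlSketch helper and the sample XML are all invented for illustration (the real response has many more elements).  It also shows one way to make the numeric reads “safe”: casting an XElement to int? yields null when the element is missing, instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Cut-down stand-in for the Peep class, just for this sketch.
public class FriendSummary
{
    public int ID { get; set; }
    public string ScreenName { get; set; }
    public int NumFollowers { get; set; }
}

public static class FriendsXmlSketch
{
    // Invented sample data; the second user has no followers_count element.
    public const string SampleXml = @"<users>
  <user><id>2</id><screen_name>zed</screen_name><followers_count>42</followers_count></user>
  <user><id>1</id><screen_name>amy</screen_name></user>
</users>";

    public static List<FriendSummary> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        return (from user in doc.Elements("users").Elements("user")
                orderby (string)user.Element("screen_name")
                select new FriendSummary
                {
                    // (int?) gives null for a missing element; ?? supplies a default
                    ID = (int?)user.Element("id") ?? 0,
                    ScreenName = (string)user.Element("screen_name"),
                    NumFollowers = (int?)user.Element("followers_count") ?? 0
                }).ToList();
    }
}
```

Calling FriendsXmlSketch.Parse(FriendsXmlSketch.SampleXml) gives back two entries, sorted by screen name, with NumFollowers defaulting to 0 for the user whose followers_count element is missing.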

Handling the Rate Limit Exception

One final thing remains for this function–handling the case when we exceed our rate limit.  I added a simple handler, to make it a little more obvious to the client that we’ve exceeded our rate limit, rather than letting the underlying WebException bubble up.  This is a little bit sloppy, since there are other things that might throw a WebException.  But this is a good start at giving the caller a little info on the rate limit issue.
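One small step towards being less sloppy is to look at the WebException’s Status property before blaming the rate limit.  The sketch below (WebErrorSketch is a made-up name) only separates “the server answered with an error status” from transport problems such as timeouts or DNS failures.  It doesn’t prove that the rate limit was the cause, since a bad screen name would also produce an error status.

```csharp
using System;
using System.Net;

public static class WebErrorSketch
{
    // True when the server responded, but with an error status code.
    // False for transport-level failures (timeout, name resolution, etc.),
    // which can't be rate limit errors.
    public static bool IsServerRejection(WebException ex)
    {
        return ex.Status == WebExceptionStatus.ProtocolError;
    }

    // The HTTP status code of the error response, or null if there was none.
    public static HttpStatusCode? StatusCodeOf(WebException ex)
    {
        HttpWebResponse resp = ex.Response as HttpWebResponse;
        return resp == null ? (HttpStatusCode?)null : resp.StatusCode;
    }
}
```

In the catch block, you could then wrap the exception in the “rate limit exceeded” message only when IsServerRejection returns true, and rethrow the original WebException otherwise.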

The Helper Class

Here is the full code for the TwitterUtility class, which just contains a handful of helper methods that we make use of in the Peeps and Peep (see below) classes.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Net;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace TwitterLibGUI
{
    /// <summary>
    /// Various global utility methods for Twitter library
    /// </summary>
    public class TwitterUtility
    {
        /// <summary>
        /// Convert a string to a valid Uri object (or null)
        /// </summary>
        /// <param name="sUri">String representing a Uri, e.g. http://blahblah.com</param>
        /// <returns></returns>
        public static Uri UriFromString(string sUri)
        {
            return ((sUri != null) && (sUri.Length > 0)) ? new Uri(sUri) : null;
        }

        /// <summary>
        /// Pull a couple fields out of the header--specifically, the Twitter API rate limit info.
        /// </summary>
        /// <param name="resp"></param>
        /// <param name="rateLimit"></param>
        /// <param name="limitRemaining"></param>
        public static void GetInfoFromResponse(WebResponse resp, out int rateLimit, out int limitRemaining)
        {
            rateLimit = 0;
            limitRemaining = 0;

            for (int i = 0; i < resp.Headers.Keys.Count; i++)
            {
                string s = resp.Headers.GetKey(i);
                if (s == "X-RateLimit-Limit")
                {
                    rateLimit = int.Parse(resp.Headers.GetValues(i).First());
                }
                if (s == "X-RateLimit-Remaining")
                {
                    limitRemaining = int.Parse(resp.Headers.GetValues(i).First());
                }
            }
        }

        /// <summary>
        /// Parse twitter date string into .NET DateTime
        /// </summary>
        /// <param name="dateString"></param>
        /// <returns></returns>
        public static DateTime ParseTwitterDate(string dateString)
        {
            return DateTime.ParseExact(dateString, "ddd MMM dd HH:mm:ss zzz yyyy", CultureInfo.InvariantCulture);
        }

        /// <summary>
        /// Return a valid DateTime for status.created_at and handle the case
        /// of the status element not being present.
        /// </summary>
        /// <param name="user">Represents status element (child of user element)</param>
        /// <returns></returns>
        public static DateTime SafeUpdateDateTime(XElement user)
        {
            DateTime creAt = new DateTime();        // Default constructor is 1/1/0001 12AM

            if (user != null)
            {
                XElement elemCreAt = user.Element("created_at");
                if (elemCreAt != null)
                {
                    creAt = ParseTwitterDate((string)elemCreAt);
                }
            }

            return creAt;
        }

        /// <summary>
        /// Return a valid update text string, whether or not the <status> element
        /// was present.
        /// </summary>
        /// <param name="user">Represents status element (child of user element)</param>
        /// <returns></returns>
        public static string SafeUpdateText(XElement user)
        {
            string sText = "";

            if (user != null)
            {
                XElement elemText = user.Element("text");
                if (elemText != null)
                {
                    sText = (string)elemText;
                }
            }

            return sText;
        }
    }
}

Here’s what’s in this class:

  • UriFromString — “Safe” assignment, creating either a valid Uri object, or null
  • GetInfoFromResponse — Read the Headers collection from the HTTP response to pull out the rate limit and # remaining API calls
  • ParseTwitterDate — Parse the funky Twitter date/time string into a DateTime object
  • SafeUpdateDateTime — Another “safe” method, filling in a DateTime object only if the created_at element exists
  • SafeUpdateText — And a “safe” assignment from the text element

(Note: Both the <created_at> and <text> elements are under the <status> element).
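As an aside, the header lookup in GetInfoFromResponse could also be written by asking the Headers collection for each header by name, rather than looping over all of the keys.  Here’s a sketch of that variation.  RateLimitHeaders and ReadIntHeader are hypothetical names, and using int.TryParse means that a missing or malformed header falls back to 0 instead of throwing.

```csharp
using System;
using System.Net;

public static class RateLimitHeaders
{
    // Read a numeric header by name; returns 0 if the header is missing
    // or its value is not a valid integer.
    public static int ReadIntHeader(WebHeaderCollection headers, string name)
    {
        int value;
        return int.TryParse(headers[name], out value) ? value : 0;
    }
}
```

With this helper, the body of GetInfoFromResponse would reduce to two calls, along the lines of rateLimit = RateLimitHeaders.ReadIntHeader(resp.Headers, "X-RateLimit-Limit") and the same again for "X-RateLimit-Remaining".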

NumFollowing and NumUpdates

My goal when I started throwing together this example was to fully populate the Peep class that I listed at the top of the post.  This includes not just # of followers for everybody in my “friends” list, but the # of people that they follow, and their total updates.  I can get most of this from the API call that we just saw–the “friends” call.  But to get # following and # updates, I need to make a different call:

http://twitter.com/users/show.xml?screen_name=screenname

This method returns a bunch of info about a particular user, including # following and # updates.  (See the XML output at the top of the post).

So the obvious thing to do would be to add an assignment in our LINQ query, calling a helper method to go off and call this other Twitter API method, right?  For each user, we could call show.xml and get the remaining two fields.

The problem with including this 2nd Twitter call in the LINQ is that we’ll blow out our Twitter rate limit.  We only get 100 requests per hour, so we’d run out of steam trying to flesh out the first 100 users.  (And any Twitter user worth his salt follows at least 100 people).

So what is to be done?  For now, I add code to the Peep class (see below) to get the remaining info on a “peep by peep” basis, rather than getting everything all at once.  This is a bit of a cop out, since we leave it up to the client to decide how often to call this method.

I’ll do a 2nd post where I actually add code to make these additional calls, but in a “rate limit safe” manner.  (Hint–we’ll use timers to slow down our use of the Twitter API).

So until we get some “rate limit smart” code, here’s the expanded code for Peep.cs, including a method that calls show.xml to get the additional info.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace TwitterLibGUI
{
    public class Peep
    {
        private const string userInfoURI = "http://twitter.com/users/show.xml?screen_name={0}";

        public string ScreenName { get; set; }
        public string Name { get; set; }
        public string Location { get; set; }
        public string Description { get; set; }
        public Uri ProfileImageURL { get; set; }
        public Uri URL { get; set; }
        public int ID { get; set; }

        public int NumFollowers { get; set; }
        public int NumFollowing { get; set; }
        public int NumUpdates { get; set; }

        public string LastUpdateText { get; set; }
        public DateTime LastUpdateDateTime { get; set; }

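        // Override (rather than hide) Object.ToString, so that callers holding
        // a plain object reference, such as list controls, get this text too.
        public override string ToString()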
        {
            return string.Format("{0} - {1}", ScreenName, Name);
        }

        /// <summary>
        /// Calculate NumFollowing and NumUpdates, since these two fields'
        /// data isn't available from an API call that gets a list of
        /// multiple users, but must be retrieved for each user individually.
        /// </summary>
        public void CalcAddlInfo()
        {
            if (string.IsNullOrEmpty(ScreenName))
                throw new ArgumentException("CalcAddlInfo: Invalid ScreenName");

            int rateLimit = 0;          // Max # API calls per hour
            int limitRemaining = 0;     // # API calls remaining

            try
            {
                // Escape the screen name so unusual characters can't corrupt the query string
                HttpWebRequest req = (HttpWebRequest)WebRequest.Create(
                    string.Format(userInfoURI, Uri.EscapeDataString(ScreenName)));

                XDocument docUser;

                // Dispose the response and reader so the connection is released promptly
                using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
                {
                    TwitterUtility.GetInfoFromResponse(resp, out rateLimit, out limitRemaining);
                    using (XmlReader reader = XmlReader.Create(resp.GetResponseStream()))
                    {
                        docUser = XDocument.Load(reader);
                    }
                }

                XElement user = docUser.Element("user");

                NumFollowing = (int)user.Element("friends_count");
                NumUpdates = (int)user.Element("statuses_count");
            }
            catch (WebException xcp)
            {
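                // Not every WebException is a rate limit problem, so don't claim that it is
                throw new ApplicationException(
                    string.Format("Twitter request failed, possibly because the rate limit was exceeded. Max = {0}/hr, remaining = {1}",
                        rateLimit,
                        limitRemaining),
                    xcp);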
            }

        }

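        /// <summary>
        /// Variant that just takes a screen name, rather than acting on an
        /// existing instance.  Returns a new Peep with the additional info filled in.
        /// </summary>
        /// <param name="screenName">Screen name of the user to look up</param>
        public static Peep CalcAddlInfo(string screenName)
        {
            Peep p = new Peep { ScreenName = screenName };
            p.CalcAddlInfo();
            return p;   // Without this, the retrieved info would be thrown away
        }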
    }
}

The CalcAddlInfo method just invokes the show.xml API call and then reads the friends_count and statuses_count fields.
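To see the LINQ to XML part in isolation, here is a minimal sketch that reads the same two fields from a hard-coded string instead of a live Twitter response.  The sample XML is a made-up, trimmed-down stand-in for what show.xml returns (only the friends_count and statuses_count element names come from the code above), and the ShowXmlDemo class and ReadCount helper are hypothetical names that exist only for this illustration.

```csharp
using System;
using System.Xml.Linq;

public static class ShowXmlDemo
{
    // Hypothetical sample standing in for a show.xml response.
    // The values are invented for illustration.
    public const string SampleXml =
        "<user>" +
        "<screen_name>example_user</screen_name>" +
        "<friends_count>123</friends_count>" +
        "<statuses_count>4567</statuses_count>" +
        "</user>";

    // Reads a required integer child element, failing with a clear
    // message if the element is missing from the response.
    public static int ReadCount(XElement user, string name)
    {
        XElement child = user.Element(name);
        if (child == null)
            throw new FormatException("Missing element: " + name);
        return (int)child;   // XElement defines an explicit conversion to int
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(SampleXml);
        XElement user = doc.Element("user");

        int numFollowing = ReadCount(user, "friends_count");
        int numUpdates = ReadCount(user, "statuses_count");

        Console.WriteLine("Following: {0}, Updates: {1}", numFollowing, numUpdates);
    }
}
```

The null check is the one thing this adds beyond what CalcAddlInfo does: casting a missing element straight to int throws an exception that doesn't say which field was absent.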

Hey, Where’s My GUI?

Ok, so at this point, we have the following code chunks:

  • Peep.cs — wraps data for a single user and gives us method to get a few additional fields
  • Peeps.cs — subclasses List<Peep> and gives us method to get list of people that we follow
  • TwitterUtility.cs — some miscellaneous helper functions

Now let’s throw a simple Win Forms GUI on top of these classes, so that we can test things out.  Here’s what the final result will look like, after calling our PeopleFollowedBy method:

[Screenshot: the GUI after loading the grid]

The code for this couldn’t be simpler.  I just call the Peeps.PeopleFollowedBy method, which returns an instance of the Peeps class (which is really a List<Peep>).  And then I bind the collection to a DataGridView.  Presto.

(If you’re paying attention, you’ll also notice that my rate limit is listed as 20,000/hr, rather than the default 100/hr.  This is because I requested “white list” status and the Twitter crew kindly consented to bump my rate limit.  This applies whenever I’m making API calls from my specific IP address).

For what it’s worth, here’s the event handler code for the Load Grid button in the GUI.  Notice that I also make a test call to the CalcAddlInfo method–so we can step through the call in the debugger and see how it works.

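        private void btnLoadGrid_Click(object sender, EventArgs e)
        {
            int rateLimit;
            int limitRemaining;

            Cursor = Cursors.WaitCursor;
            try
            {
                Peeps peeps = Peeps.PeopleFollowedBy(txtScreenName.Text, out rateLimit, out limitRemaining);

                lblNumFollowing.Text = peeps.Count.ToString();
                lblRateLimit.Text = rateLimit.ToString();
                lblRemaining.Text = limitRemaining.ToString();

                // Test call, so we can step through CalcAddlInfo in the debugger.
                // Guard the index in case the user follows fewer than six people.
                if (peeps.Count > 5)
                    peeps[5].CalcAddlInfo();

                dgvPeeps.DataSource = peeps;
            }
            finally
            {
                // Restore the cursor even if one of the Twitter calls throws
                Cursor = Cursors.Default;
            }
        }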

Wrapping Up and Next Steps

That’s all there is to it–the process of making calls to the Twitter API from .NET code and consuming the resulting XML data using LINQ is pretty straightforward.

Where am I headed next, on my way to doing some Silverlight demos?  Here’s what I’ll cover in the next post:

  • Making my API methods smart about rate limits, using timers to acquire Twitter data quietly in the background–at a rate that is just slow enough to not trigger the rate limit.
  • Possibly caching the data on the client
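As a rough preview of the arithmetic behind the first bullet, the sketch below works out how far apart calls need to be spaced to spread an hourly allowance evenly.  It is only the pacing calculation, not the timer-based code itself, and the RateLimitPacing class and MinInterval method are hypothetical names invented for this example.  It also assumes that the limit is a simple count per hour, as described above.

```csharp
using System;

public static class RateLimitPacing
{
    // Gap between calls that spreads callsPerHour requests
    // evenly across one hour.
    public static TimeSpan MinInterval(int callsPerHour)
    {
        if (callsPerHour <= 0)
            throw new ArgumentOutOfRangeException("callsPerHour");
        return TimeSpan.FromSeconds(3600.0 / callsPerHour);
    }

    public static void Main()
    {
        Console.WriteLine(MinInterval(100));     // the default limit
        Console.WriteLine(MinInterval(20000));   // the white-listed limit
    }
}
```

At the default limit of 100 calls per hour this comes to one call every 36 seconds.  The result could be used to set a timer's interval, e.g. the Interval property of System.Timers.Timer, which is expressed in milliseconds.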

If we were going to productize the code that I’ve presented, we’d also want to think about:

  • Moving the Twitter API stuff out of the data objects and into a separate class
  • Better exception handling
  • Wrapping the entire Twitter API, rather than just a couple methods

Hello WPF World, part 2 – Why XAML?

Let’s continue poking around with a first WPF “hello world” application.  We’ll continue comparing our bare bones wizard-generated WPF project with an equivalent Win Forms application.  And we’ll look at how XAML fits into our application architecture.

Last time, we compared the Win Forms Program class with its parallel in WPF–an App class, which inherits from System.Windows.Application.  The application framework in Win Forms was pretty lightweight–we just had a simple class that instantiated a form and called the Application.Run method.  WPF was just a bit more complicated.  If we count the generated code, we have an App class split across a couple of files, as well as a .xaml file that defines application-level properties (like the startup window).

Now let’s compare the main form in our Win Forms application with the main window generated for us in WPF.  (The fact that WPF calls it a window, rather than a form, hints at the idea that GUI windows aren’t meant to be used just for entering data in business applications).

In Windows Forms, we have two files for each form–the file containing designer-generated code (e.g. Form1.Designer.cs) and the main code file where a user adds their own code (e.g. Form1.cs).  These two source files completely define the form and are all that’s required to build and run your application.  In Windows Forms, the designer renders a form in the IDE simply by reading the Form1.Designer.cs file and reconstructing the layout of the form directly from the code.  (The IDE does create a Form1.resx resource file, but by default your form is not localizable and the resource file contains nothing).

When you think about it, this approach is a bit kludgy.  The designer is inferring the form’s layout and control properties by parsing the code and reconstructing the form.  Partial classes at least let us keep the designer-generated code isolated in its own file, Form1.Designer.cs.  But it’s clumsy to use procedural code to define the static layout of a form.

Here’s a picture of how things work in Win Forms:

In this model, the Form1.Designer.cs file contains all the procedural code that is required to render the GUI at runtime–instantiation of controls and setting their properties.  We could dispense with the designer in Visual Studio—it’s just a convenient tool for generating the code.  (I’m ashamed to admit that I’ve worked on projects that broke the designer and everyone worked from that point on only in the code–ugh)!

Now let’s look at WPF.  Here’s a picture of what’s going on:

Note the main difference here: our designer works with XAML, rather than working with the code.  This is the big benefit of using XAML–that the tools can work from a declarative specification of the GUI, rather than having to parse generated code.  This also means that it’s easier to allow other tools to work with the same file–e.g. Expression Blend, or XamlPad.

Then at build time, instead of just compiling our source code, the build system first generates source code from the XAML file and then compiles the source code.

But this isn’t quite the whole story.  It’s not the case in WPF that the Window1.g.cs file contains everything required to render the GUI at runtime.  If we look at the Window1.g.cs file, we don’t find the familiar lines where we are setting control properties.  Instead, we see a call to Application.LoadComponent, where we pass in a path to the .xaml file.  We also find a very interesting method called System.Windows.Markup.IComponentConnector.Connect(), which appears to be getting objects passed into it and then wiring them up to private member variables declared for each control.  If we add a single button to our main window, the code looks something like:

But then the obvious question is–what happened to all those control properties?  Where do the property values come from at runtime?

Enter BAML–a binary version of the original XAML that is included with our assembly.  Let’s modify the above picture to more accurately reflect what is going on:

Note the addition–when we build our project, the contents of the XAML file–i.e. a complete definition of the entire GUI–is compiled into a BAML file and stored in our assembly.  Then, at runtime, our code in Window1.g.cs simply loads up the various GUI elements (the logical tree) from the embedded BAML file.  This is done by the Connect method that we saw earlier, in conjunction with a call to Application.LoadComponent:

MSDN documentation tells us, for LoadComponent, that it “loads a XAML file that is located at the specified uniform resource identifier (URI) and converts it to an instance of the object that is specified by the root element of the XAML file”.  When we look at the root element of the XAML file for our application, we discover that it is an object of type Window, with the specific class being HelloWPFWorld.Window1.  Voila!  So we now see that the code in Window1.g.cs which was generated at build time just contains an InitializeComponent method whose purpose is to reconstitute a Window and all its constituent controls from the GUI definition in the XAML file.  (Which went along for the ride with the assembly as compiled BAML).
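The hand-off between the loader and the generated class can be mimicked without WPF at all.  The sketch below is an analogy only, not WPF’s actual code: IConnector mirrors the shape of IComponentConnector.Connect (an integer ID plus an object), while FakeButton, FakeWindow and FakeLoader are hypothetical stand-ins, and a dictionary plays the part of the BAML.  The point is just that the window class declares fields and wires them up, while the property values come from the declarative description.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for IComponentConnector (same method shape).
public interface IConnector
{
    void Connect(int connectionId, object target);
}

// Stand-in for a control.
public class FakeButton
{
    public string Content;
}

// Plays the role of the generated window class: it declares a field
// per control and wires it up, but sets no control properties itself.
public class FakeWindow : IConnector
{
    public FakeButton button1;

    public void Connect(int connectionId, object target)
    {
        switch (connectionId)
        {
            case 1:
                button1 = (FakeButton)target;
                break;
        }
    }
}

// Plays the role of the loader: it builds objects from a declarative
// description (a dictionary here, instead of BAML) and hands each
// one to Connect along with its ID.
public static class FakeLoader
{
    public static void Load(IConnector root, IDictionary<int, string> description)
    {
        foreach (KeyValuePair<int, string> entry in description)
        {
            FakeButton b = new FakeButton();
            b.Content = entry.Value;   // property value comes from the description
            root.Connect(entry.Key, b);
        }
    }
}

public static class ConnectorDemo
{
    public static void Main()
    {
        Dictionary<int, string> description = new Dictionary<int, string>();
        description.Add(1, "Click me");

        FakeWindow window = new FakeWindow();
        FakeLoader.Load(window, description);

        Console.WriteLine(window.button1.Content);
    }
}
```

Notice that FakeWindow never mentions the text “Click me”.  That is the same division of labor described above: the generated code holds the references, and the loaded markup supplies the values.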

So what is BAML and where is it?  BAML (Binary Application Markup Language) is nothing more than a compiled version of the corresponding XAML.  It’s not procedural code of any sort–it’s just a more compact version of XAML.  The purpose is just to improve runtime performance–the XAML is parsed/compiled at build time into BAML, so that it does not have to be parsed at runtime when loading up the logical tree.

Where does this chunk of BAML live?  If you take a look at our final .exe file in ILDASM, you’ll see it in the manifest as HelloWPFWorld.g.resources.  Going a tiny bit deeper, the Reflector tool shows us that HelloWPFWorld.g.resources contains something called window1.baml, which is of type System.IO.MemoryStream.  (I found something that indicated there was also a BAML decompiler available from the author of Reflector, which would allow you to extract the .baml from an assembly and decompile back to .xaml–but I couldn’t find the tool when I went looking for it).

So there you have it.  We haven’t quite yet finished our “hello world” application, but we’re close.  We’ve now looked in more depth at the structure of the application and learned a bit about where XAML fits into the picture.  Next time, we’ll add a few controls to the form and talk about how things are rendered.

It’s a WPF World, part 2

Let me continue my ramble about Microsoft technologies leading up to WPF.  Last time, I ended by talking about the .NET technologies and why I think they are so important.  .NET has become the de facto standard for developing applications for the Windows platform (thick clients).  And although ASP.NET likely doesn’t have nearly as big a chunk of market share as Windows Forms, it feels like the WISA stack (Windows, IIS, SQL Server, ASP.NET) is gradually overtaking the LAMP stack (Linux, Apache, MySQL, PHP).  And with the rise of RIAs (Rich Internet Applications), ASP.NET Ajax will likely encourage the continued adoption of the ASP.NET technologies.

Going back to my list of the most important benefits of the .NET world from last time, I realized that I’d like to add a final bullet item.  (In the list–what’s so special about .NET):

  • Programmer Productivity — with things like intellisense and code snippets in Visual Studio, you can be incredibly productive, whether working in Win Forms or ASP.NET

I make the claim about productivity without having had experience with other development environments (at least since doing OWL development with Borland tools).  But the rise in productivity is true even just of Microsoft tools.  They just continue to get better and better.  And though “real programmers” might pooh-pooh all this intellisense nonsense in favor of hand coding in Textpad, I have to believe that even these guys would be more productive if they truly leveraged the tools.

When .NET first came out, I remember reading the marketing materials and being a little misled about where ASP.NET fit into things.  It seemed like Microsoft was touting convergence of Windows vs. web development, by talking up the similarities of the dev experience, working in Windows Forms vs. Web Forms.  Developing with ASP.NET was ostensibly very similar to developing Win Forms applications–you started with an empty design surface, dragged visual controls onto it, and double-clicked to bring up an editor where you wrote your code-behind.  (Never mind that this model encouraged low quality architectures as people wrote monolithic chunks of event handler code–perpetuating bad habits that we learned with VB6).

But the web model was still very, very different from the thick client model.  On Windows, we were still working with an event-driven model, using the same old Windows message loop that we’d always used.  But users interact with a web-based interface in a completely different way.  We used server-side controls in ASP.NET to hide some of the complexity, but we were still just delivering a big glob of HTML to the client and then waiting for a new HTTP request.

The ASP.NET development environment also felt a bit kludgy.  I remember being a bit dismayed when I wrote my first ASP.NET application.  In the classic Win Forms development environment, there were two separate views of your application, where a developer lived–the design surface and the code behind.  The ASP.NET environment has three–the design surface, the code behind, and the HTML content.  So now instead of a developer jumping back and forth between two views, you end up jumping around in all three.

Spoiler–this web application architecture gets a lot cleaner with Silverlight 2.0.  There seems to be actual convergence between thick vs. thin clients as we use WPF for thick clients, Silverlight 2.0 for thin.  But more on that later.

So along comes WPF (Windows Presentation Foundation) and XAML (Extensible Application Markup Language).

I remember when I first read about XAML and WPF (Avalon at the time).  My first reaction was to be mildly frustrated.  At first glance, XAML seemed to be an arbitrary switch to a new syntax for defining GUIs.  And it seemed cryptic and unnecessary.  In the effort to be able to define a user interface declaratively rather than procedurally, it looked like we were ending up with something far messier and garbled than it needed to be.  It felt much easier to understand an application by reading the procedural code than by trying to make sense of a cryptic pile of angle brackets.

But I’ve come to realize that the switch to defining GUIs declaratively makes a lot of sense.  With XAML, thick clients move a bit closer to web applications, architecturally–the GUI is built from a static declaration of the UI (the XAML), a rendering engine, and some code behind.  And the layout of a user interface (rather than the behavior) is implicitly static, so it makes sense to describe it declaratively.  [As opposed to Windows Installer technology, which converts the dynamics of the installation process to something declarative and ends up being far messier].

Why does it make sense to separate the markup from the code like this?

  • Architecturally cleaner, separating the what (XAML declaration of GUI) from the how (code-behind).  (Separating the Model from the View)
  • Enables broad tool support–tools can now just read from and write to XAML, since the GUI is now defined separately from the application itself.  (Enabling separate tools for designers/devs, e.g. Expression Blend and Visual Studio).
  • Cleaner for web apps — because we can now serialize a description of the GUI and send it across the wire.  This just extends the ASP.NET paradigm, using a similar architecture, but describing a much richer set of elements.

Another of my earlier reactions to XAML was that it was just a different flavor of external resource (.resx) files.  It looked similar–using angle brackets to describe GUI elements.  But XAML goes far beyond .resx files.  Resource files are used to externalize properties of UI controls that are potentially localizable, e.g. control sizes, locations and textual elements.  But the structure is flat because a resource file is nothing more than a big collection of keyword/value pairs.  Nothing can be gleaned about the structure of the UI itself by looking at the .resx file.  XAML, on the other hand, fully describes the UI, hierarchically.  It is far more than a set of properties: it is logically complete, in that it contains everything required to render the GUI.

XAML is a big part of what makes WPF so powerful.  But there are a number of other key features that differentiate WPF from Windows Forms.

  • Totally new rendering engine for the GUI, based on Direct3D.  This enables better performance in rendering of the GUI, because everything in your GUI is described as 3D objects.  So even apparent 2D user interfaces can take advantage of hardware acceleration on the graphics card.  [The new NVidia GT200 GPU has 1.4 billion transistors, compared with original Core 2 Duo chips, which were in the neighborhood of 300 million].
  • Vector graphics.  The GUI is now entirely defined in vector graphics, as opposed to bitmapped/raster.  This is huge, because it means that you can define the GUI geometry independent of the target machine’s screen resolution.  This should mean–no more hair-pulling over trying to test/optimize at various screen resolutions and DPI settings.  (Gack)!
  • Bringing other APIs into the .NET fold.  E.g. 3D, video, and audio are now accessible directly from the .NET Framework, instead of having to use APIs like DirectX.
  • Focus on 3D graphics.  All of the above technologies just make it easier to develop stunning 3D graphical user experiences.  Powerful/cheap graphics hardware has led to 3D paradigms like Apple’s “cover flow” showing up more and more often in the average user interface.  (Battleship Grey, you will not be missed)

So where does WPF fit into the rest of the .NET world?  Is WPF a complete replacement for Windows Forms?

Adopting WPF will not involve nearly as steep a learning curve as adopting .NET did, originally.  WPF is a full-fledged citizen of the .NET world, existing as a series of .NET namespaces.  So .NET continues to be the premier Microsoft development technology.

WPF is definitely a replacement for Windows Forms.  It is the new presentation layer, meant to be used for creation of new Windows-based applications (thick clients).  Microsoft is probably hesitant to brand WPF purely as a Windows Forms replacement, not wanting to dismay development shops that have invested a lot in learning .NET and Windows Forms.  But it’s clearly the choice for new Windows-based UI development, especially as the number of WPF controls provided by the tools, and shipped by 3rd party vendors, increases.

WPF is also highly relevant to web development (thin clients).  Silverlight 2 allows a web server to deliver browser-based RIAs containing a subset of the widgets found in WPF.  (Silverlight was originally code-named Windows Presentation Foundation/Everywhere, or WPF/E).  With WPF and Silverlight, Windows and web development are definitely converging.

So we clearly now live in a WPF world.  WPF will rapidly become more widely adopted, as it is used for more and more line-of-business applications, as well as serving as the underlying engine for the new Silverlight 2 RIAs that are starting to appear.  And the best news is that we also still live in a .NET world.  We get all of the .NET goodness that we’ve learned to love, with WPF being the shiniest new tool in our .NET toolbox.