PDC 2008, Day #4, Session #1, 1 hr 30 mins
Gianpaolo Carraro
As the last day of PDC starts, I’m down to four sessions to go. I’ll continue doing a quick blog post on each session, where I share my notes, as well as some miscellaneous thoughts.
The Idea of a Symposium
Gianpaolo started out by explaining that they were finishing PDC by doing a pair of symposiums, each a series of three different sessions. One symposium focused on parallel computing and the other on cloud-based services. This particular session was the first in the set of three that addressed cloud services.
The idea of a symposium, explained Gianpaolo, is to take all of the various individual technologies and try to sort of fit the puzzle pieces together, providing a basic context.
The goal was also to present some of the experience that Microsoft has gained from early usage of the Azure platform over the past 6-12 months. He said that he himself has spent the last 6-12 months using the new services, so he had some thoughts to share.
This first session in the symposium focused on taking existing business applications and expanding them to “the cloud”. When should an ISV do this? Why? How?
Build vs. Buy and On-Premises vs. Cloud
Gianpaolo presented a nice matrix showing the two basic independent decisions that you face when looking for software to fulfill a need.
- Build vs. Buy – Can I buy a packaged off-the-shelf product that does what I need? Or are my needs specialized enough that I need to build my own stuff?
- On-Premises vs. Cloud – Should I run this software on my own servers? Or host everything up in “the cloud”?
There are, of course, tradeoffs on both sides of each decision. These have been discussed ad infinitum elsewhere, but the basic tradeoffs are:
- Build vs. Buy – Features vs. Cost
- On-Premises vs. Cloud – Control vs. Economy of Scale
Here’s the graph that Gianpaolo presented, showing six different classes of software, based on how you answer these questions. Note that on the On-Premises vs. Cloud scale, there is a middle column that represents taking applications that you essentially control and moving them to co-located servers.
This is a nice way to look at things. It shows that, for each individual software function, it can live anywhere on this graph. In fact, Gianpaolo’s main point is that you can deploy different pieces of your solution at different spots on the graph.
So the idea is that while you might start off on-premises, you can push your solution out to either a co-located hosting server or to the cloud in general. This is true of both packaged apps as well as custom-developed software.
Challenges
The main challenge in moving things out of the enterprise is dealing with the various issues that show up when your data needs to cross the corporate/internet boundary.
There are several separate types of challenges that show up:
- Identity challenges – as you move across various boundaries, how does the software know who you are and what you’re allowed to access?
- Monitoring and Management challenges – how do you know if your application is healthy, if it’s running out in the cloud?
- Application Integration challenges – how do various applications communicate with each other across the boundaries?
Solutions to the Identity Problem
Gianpaolo proposed the following possible solutions to this problem of identity moving across the different boundaries:
- Federated ID
- Claim-based access control
- Geneva identity system, or CardSpace
The basic idea was that Microsoft has various assets that can help with this problem.
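To make the claims-based access control idea a bit more concrete, here is a minimal sketch. This is not the Geneva or CardSpace API; the token shape, issuer URL, and function names are all invented for the example. The key point it illustrates is that the application checks claims issued by a trusted, federated identity provider rather than a local user database:

```python
# Toy claims-based access check. In a real federated setup the token would
# be a signed security token from an STS; here it's just a dict.
token = {
    "issuer": "https://sts.contoso.com",  # hypothetical federated issuer
    "claims": {"role": "manager", "email": "alice@contoso.com"},
}

TRUSTED_ISSUERS = {"https://sts.contoso.com"}

def can_approve(token):
    """Trust the role claim only if it came from a federated, trusted issuer."""
    return (token["issuer"] in TRUSTED_ISSUERS
            and token["claims"].get("role") == "manager")

print(can_approve(token))  # True
```

The application never sees a password; it only decides which issuers to trust and which claims matter.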
Solutions to the Monitoring and Management Problem
Next, the possible solutions to the monitoring and management problem included:
- Programmatic access to a “Health” model
- Various management APIs
- Firewall-friendly protocols
- PowerShell support
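Here is a toy sketch of what the “health model” idea might look like programmatically. The class and method names are invented for illustration and are not part of any Azure management API; the point is that each component reports its own status, and a management interface aggregates them:

```python
# Minimal sketch of a programmatic health model: components report status,
# and the service aggregates them into one overall answer.
class HealthModel:
    def __init__(self):
        self.components = {}

    def report(self, name, healthy, detail=""):
        """A component (e.g. a worker role) reports its current status."""
        self.components[name] = {"healthy": healthy, "detail": detail}

    def overall(self):
        """The service is healthy only if every component is."""
        return all(c["healthy"] for c in self.components.values())

model = HealthModel()
model.report("web-frontend", True)
model.report("encoder-worker", False, "queue backlog growing")
print(model.overall())  # False: one unhealthy component
```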
Solutions to the Application Integration Problem
Finally, some of the proposed solutions to the application integration problem included:
- ServiceBus
- Oslo
- Azure storage
- Sync framework
The ISV Perspective
The above issues were all from an IT perspective. But you can look at the same landscape from the perspective of an independent software vendor, trying to sell solutions to the enterprise.
To start with, there are two fundamentally different ways that the ISV can make use of “the cloud”:
- As a service mechanism, for delivering your services via the cloud
- You make your application’s basic services available over the internet, no matter where it is hosted
- This is mostly a customer choice, based on where they want to deploy
- As a platform
- Treating the cloud as a platform, where your app runs
- Benefits are the economy of scale
- Mostly an ISV choice
- E.g. you could use Azure without your customer even being aware of it
When delivering your software as a service, you need to consider things like:
- Is the feature set available via the cloud sufficient?
- Firewall issues
- Need a management interface for your customers
Some Patterns
Gianpaolo presented some miscellaneous design considerations and patterns that might apply to applications deployed in the cloud.
Cloudbursting
- Design for average load, handling the ‘peak’ as an exception
- I.e. only go to the cloud for scalability when you need to
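A toy sketch of the cloudbursting decision, with made-up capacity numbers (nothing here comes from the talk beyond the design-for-average-load idea):

```python
# Cloudbursting in miniature: size on-premises servers for average load,
# and route overflow to the cloud only during peaks.
ON_PREM_CAPACITY = 100  # requests/sec the local servers are sized for

def route_request(current_load):
    """Send a request on-premises under normal load, to the cloud at peak."""
    if current_load <= ON_PREM_CAPACITY:
        return "on-premises"
    return "cloud"

print(route_request(80))   # on-premises
print(route_request(250))  # cloud: the peak is handled as the exception
```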
Worker / Queue / Blob Pattern
Let’s say that you have a task like encoding and publishing video. You push the raw data out to the cloud, where it is placed in a “blob” in cloud storage. You then add an entry to a queue, indicating that there is work to be done, and a separate worker process eventually picks up the message and does the encoding work.
This is a nice pattern for supporting flexible scaling—both the queues and the worker processes could be scaled out separately.
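The pattern can be sketched in a few lines. This is an in-process stand-in only: a dict plays the role of blob storage, Python’s `queue.Queue` plays the role of the cloud queue, and uppercasing a string stands in for the encoding work:

```python
# In-process sketch of the worker/queue/blob pattern.
import queue

blob_store = {}             # blob name -> raw data
work_queue = queue.Queue()  # messages naming blobs that need encoding

def submit(name, raw_video):
    blob_store[name] = raw_video  # 1. put the raw data in a blob
    work_queue.put(name)          # 2. enqueue a pointer to the work

def worker():
    """A worker process: pull a message, fetch the blob, do the work."""
    while not work_queue.empty():
        name = work_queue.get()
        raw = blob_store[name]
        blob_store[name + ".encoded"] = raw.upper()  # stand-in for encoding
```

Because the queue decouples submission from processing, you can run many worker processes against the same queue, which is exactly where the independent scaling comes from.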
CAP: Pick 2 out of 3
- Consistency
- Availability
- Tolerance to network Partitioning
Eventual Consistency (ACID – BASE)
The idea here is that we are all used to the ACID characteristics listed below. They guarantee that the data is consistent and correct, which means that performance will likely suffer. As an example, a process might submit data synchronously because we need to guarantee that the data reaches its destination.
But Gianpaolo talked about the idea of “eventual consistency”. For most applications, while it’s important for your data to be correct and consistent, it’s not necessarily important for it to be consistent right now. This leads to a model that he referred to as BASE, with the characteristics listed below.
- ACID
- Atomicity
- Consistency
- Isolation
- Durability
- BASE
- Basically Available
- Soft state
- Eventually consistent
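Here is a toy model of what eventual consistency looks like, entirely invented for illustration: a write lands on one replica and returns immediately, and a background sync pass later brings the other replicas up to date. In between, reads from other replicas can be stale, which is the “soft state” part of BASE:

```python
# Toy eventual consistency: writes acknowledge after one replica,
# and a later sync pass converges the rest.
replicas = [{"x": 0}, {"x": 0}, {"x": 0}]

def write(value):
    replicas[0]["x"] = value  # acknowledge after one replica, not all

def sync():
    """Background pass that copies the latest value to the other replicas."""
    latest = replicas[0]["x"]
    for r in replicas[1:]:
        r["x"] = latest

write(42)
stale = replicas[2]["x"]  # a read here can still see the old value (0)
sync()
print([r["x"] for r in replicas])  # [42, 42, 42] after convergence
```

The application accepts the temporary staleness in exchange for not blocking every write on all replicas.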
Fundamental Lesson
Basically the main takeaway is:
- Put the software components in the place that makes the most sense, given their use