That Conference 2017 – From Dull to Dazzling: How Visualization Enhances Data Comprehension

That Conference 2017, Kalahari Resort, Lake Delton, WI
From Dull to Dazzling: How Visualization Enhances Data Comprehension – Walt Ritscher

Day 3, 9 Aug 2017

Disclaimer: This post contains my own thoughts and notes based on attending That Conference 2017 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of the speakers.

Walt Ritscher

  • LinkedIn Learning – online video training (formerly Lynda.com)
  • Content available on both sites
  • 7100 courses
  • @WaltRitscher
  • Bit.ly/vizbooks
  • http://Xamlwonderland.com


Executive Summary

  • Showing data visually can make patterns and trends suddenly very obvious
  • Varying graphical data items’ size, color, position, contrast or shape can make a big difference in how a user views the data
  • Review of various data visualization tools and some examples of graphs


Your Data

  • We all have various data sources and lots of data
  • Big Data – lots of data
    • Gathering at unprecedented rate
    • Many sources–sensors, online transactions, medical, tweet streams
  • If you don’t use/analyze data, you’re a hoarder
    • Stored data is inert
    • Need to make it Actionable
  • Raw data can be hard to understand
    • E.g. Big spreadsheet
    • “Wall of text”


Visualizing

  • Bring human optical system into picture
  • Visual recognition is 60,000x faster than text recognition


Example – Anscombe’s Data

  • 4 sets of XY data, with averages and std deviation
  • Visually view same data
  • We see patterns
  • And pattern anomalies
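
A minimal Python sketch of the point above, using the published Anscombe's quartet values (the helper name `summarize` is mine, not from the talk). The summary statistics come out nearly identical for all four sets, which is why the differences only show up once the points are plotted.

```python
from statistics import mean, stdev

# Anscombe's quartet: four small x/y data sets.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (mean x, stdev x, mean y, stdev y), each rounded to 2 decimals."""
    return tuple(round(v, 2) for v in (mean(xs), stdev(xs), mean(ys), stdev(ys)))


# Print one summary row per data set; the rows are almost indistinguishable.
for name, (xs, ys) in QUARTET.items():
    print(name, summarize(xs, ys))
```

Feeding the same four sets to any scatter-plot tool shows a noisy line, a curve, a line with one outlier, and a vertical cluster with one outlier.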


Demo – Differences – Test

  • Changes to hue, saturation, size of object
  • Change shape–quickly recognize
  • Can’t find difference if original shapes very different


Differentiator

  • Size
  • Color
  • Position
  • Contrast
  • Shape


Highlight Differences with 5D

  • Use one of the differentiators to highlight a difference
  • If original set is uniform in shape, but different colors, you’d differentiate by shape to make it stand out
  • Counting # instances of “8” shape
  • Could colorize rectangles around digits
    • Heat map
    • Easy to spot area with higher numbers
  • Or could change size of shape by value
    • Bubble chart
    • Easy to see high growth
    • But harder to see actual numeric data


A Word about Colors

  • Chart–happy customers green, angry red
  • Don’t use green/red
  • Problem–with red/green color-blindness, about 10% of people can’t differentiate them
  • Could do Blue / Red–easier to differentiate (99.5% population)
  • Example–colorizing heatmap
    • Which colors seem to be “higher” value color
    • Brain doesn’t do this
  • Rule: don’t differentiate by Hue
    • Differentiate by Saturation or Lightness
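
One way to follow the rule above is to hold hue fixed and vary only lightness. The sketch below is an illustration, not something shown in the talk; the function name, the hue value (0.6, roughly blue) and the lightness range are all assumptions.

```python
import colorsys


def lightness_ramp(hue, steps, saturation=0.6, lightest=0.9, darkest=0.3):
    """Return `steps` hex colors of a single hue, running from light to dark."""
    colors = []
    for i in range(steps):
        # t goes from 0.0 (first color) to 1.0 (last color).
        t = i / (steps - 1) if steps > 1 else 0.0
        lightness = lightest + t * (darkest - lightest)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors


BLUE_HUE = 0.6  # assumed hue, expressed as a fraction of the color wheel
ramp = lightness_ramp(BLUE_HUE, 5)
print(ramp)
```

Mapping low values to the light end and high values to the dark end gives a heat map where "darker means more" without relying on hue.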


Motion and Animation

  • Demo – slightly changing something, moving something slightly on screen
  • Your brain sees the motion
    • Can see something moving even 1-2 pixels
  • Can animate data, showing stuff in sequence
  • If you have time-stamped data, consider animating that


Warning–eyes can be deceived

  • With a lot of lines, you can’t see more than a couple black dots
  • Pie chart – 30% slice looks smaller if it’s in the back
    • Never use 3D pie chart
  • Displaying change in one variable using area or volume
    • Showing relative sizes using area–doesn’t work


Terminology – The Buzzwords

  • Categories of visualization
    • Data Visualization – show data via graphical
    • Infographic – design friendly approach to visualization
    • Motion Graphics or Animated – use motion to accentuate


Data Viz Categories

  • Business tools – Excel
    • Excel continues to serve as an analysis engine
    • Can launch in background and generate results or chart
    • Could take screenshot in the background
  • Drawing tools – Illustrator
  • Code tools – Visual Studio


Data Viz Tools

  • Infogr.am
  • Processing – programming language
  • ProcessingJS
  • Tableau
  • D3.js
  • R – language to manipulate data (no viz)


Data Viz Browser Tools

  • SVG
  • 2D Canvas
  • WebGL


StreamGraph

  • Sideways bar chart
  • Excellent
  • Web site allowing you to pull data out (babynamewizard.com) ?


Other

  • Glasswire – exploring data
    • Timeline with sliders
    • Nice alternative to having two calendar dropdowns


Infographic

  • Designer-friendly approach to data visualization
  • View uploads by category
    • Wedges in triangle
    • “Umbrella graph” ?
    • Callout coming out of wedge


Motion Graphics

  • NASA Perpetual Ocean
    • Latitude, Longitude
    • Flow direction & speed
    • Timestamp


Books

  • Visualization Data
  • Bit.ly/vizbooks
  • Tufte


Charts

  • Time-tested concept but still useful
  • Have done bar and line charts forever
  • But lots of new charts out there
  • New charts
    • Waffle chart–boxes
    • Hierarchical Edge Bundling–connecting nodes, bezier curves, trends pop out
    • Adjacency matrix–e.g. Les Miserables Co-occurrence; awesome animation
  • D3.js
  • Bret Victor

That Conference 2017 – 12 Reasons Your API Sucks

That Conference 2017, Kalahari Resort, Lake Delton, WI
12 Reasons Your API Sucks – D. Keith Casey Jr

Day 3, 9 Aug 2017

D. Keith Casey Jr

keith@caseysoftware.com

@CaseySoftware

  • Twilio, Okta, Clarify
  • A Practical Approach to API Design – theapidesignbook.com


Executive Summary

  • There are a number of things that you can do to deliver high quality API code


Assumptions

  • APIs are important part of your job
  • Use them on a regular basis
  • Potentially build them too
  • Sometimes public, sometimes private
    • Same principles should apply
    • For internal API, if it’s awful, internal users can’t use another one
    • I.e. Adoption = in spite of your best efforts
  • Nothing is perfect
    • You make mistakes
    • Your providers make mistakes
    • That other team are knuckleheads
      • “Why do they work here”?


Developer Experience

  • “Developers are users too”
  • Don’t Make Me Think – Krug
    • When you make someone figure something out, you’re taking time away from their main objectives
    • When you interrupt someone, you “reset” their work
  • Developers
    • Want to build something useful
    • Want to go home at end of day
  • Set aside an hour – you really tried
    • Phone calls, e-mails, IM, Slack, etc.
    • “Do you have a minute”?
    • TPS reports
  • At Twilio
    • 5-min onboarding experience


1 – Documentation

  • If delivering documentation via PDF–stop
  • HTML–github.com/lord/slate
    • Slate great, widespread use
    • You write in markdown
  • When people are ready to use the API, documentation needs to be ready
  • Let’s get interactive
    • Swagger – now OpenAPI
      • Load up in browser and interact with endpoints via browser
      • Level of skill required to try out goes way down
    • (competitors) JSON Schema, API Blueprint


2 – Incomplete Docs

  • If you’re only documenting SDK, that’s not complete API documentation
  • Must have actual API reference docs
    • Exact syntax for each endpoint
    • High level of detail
  • JavaScript drinking game–random noun + .js
  • When you take dependency on 3rd party library
    • If it’s not popular, you’ll end up bringing it in house
  • Need reference docs + How To docs
    • Basic API is straightforward to figure out
    • But need to tell developers how to do something useful


3 – Getting Started Code

  • Sample code that solves a problem
    • How to do a “thing”
    • But nobody cares about that thing
  • 1st thing you need
    • Authentication
  • Need to give them sample code that solves an important problem
    • But we don’t know what’s important
    • Customer wants you to solve their specific problem
    • User judges docs based on whether it solves their problem
  • Work quickly to make someone successful


4 – “Innovative” Interfaces

  • Nobody really wants innovative
  • Everyone tries to create their own
    • HTTP verbs – important that they’re standard
    • Response codes – don’t do this
    • Your own protocols – also huge ecosystem out there
      • Probably not qualified to build your own
    • “Don’t Be Dumb”


5 – Authentication

  • Don’t roll your own “encryption” scheme
  • Don’t roll your own “new” methods
    • You’re not qualified to create something new
    • Good encryption schemes have been out there a long time and have been hammered on
    • Your new scheme won’t have been deeply reviewed and tested
  • Use existing scheme like OAuth
    • Less training required – for users
    • Reuse common libraries for clients & server
    • Faster onboarding – for internal developers


6 – Inconsistencies

  • Consistency of URIs
    • See some URIs, you want to extrapolate to figure out other URIs
  • POST -d {data}
    • Use 201 Created and Location header
    • Don’t do 201 for most, then 200 for one
  • If you are inconsistent/wrong on day 1, you’ll have to support that mistake for years
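
A small sketch of one way to keep create responses consistent: route every successful POST through a single helper so the status code and Location header cannot drift between endpoints. The helper name and the URIs are hypothetical, not from the talk.

```python
def created_response(collection_uri, new_id, body):
    """Build the (status, headers, body) triple for every successful create."""
    # The new resource's URI is always <collection>/<id>, so clients can
    # extrapolate from one endpoint to the next.
    location = "{}/{}".format(collection_uri.rstrip("/"), new_id)
    return 201, {"Location": location}, body


# Hypothetical usage from two different endpoints.
order_status, order_headers, _ = created_response("/orders", 42, {"id": 42})
customer_status, customer_headers, _ = created_response("/customers/", "abc", {"id": "abc"})
print(order_status, order_headers["Location"])
```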


7 – Poor Modeling

  • Example–coffee cup w/handle
    • Accomplishes primary goal (don’t get burned)
  • When building API, what is user’s primary goal
    • Single sentence, primary use case
  • Affordances
    • What problems/tasks does it make simple?
    • What is the API producer’s goal?
    • What do you want to do?
    • (or) Why are people giving us money?


8 – Stack Overflow Problem

  • Many different ways to do something
    • Other places are replicating your documentation, advising how to do something
    • And these documents end up on Google
  • E.g. Multiple ways to pass auth token into API
    • E.g. Auth Header, URL, or body
  • Just give them one way to do something


9 – Your Sh.. Stuff Is Broken

  • Is Support run by developers?
  • What does your uptime look like?
    • Never 100%
    • Cost increases exponentially as you add digits to %
    • Two digits easy
  • Do you have a Trust page?
    • Usually yourapi/trust — status of issues, history, what happened
    • Need to be open with the devs who are using your API
  • Need to make sure your stuff works
    • SLA in place once people are using it
  • Core hours during which API must be up
    • And some users out there who need it during off hours
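
To make the uptime digits concrete, here is the arithmetic for how much downtime each target allows per year. This is my own worked example rather than a figure from the talk, and it says nothing about cost, only about how quickly the downtime budget shrinks.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years


def allowed_downtime_minutes(uptime_percent):
    """Minutes per year a service may be down and still meet the target."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


# Each extra digit cuts the yearly downtime budget by a factor of ten.
for pct in (99.0, 99.9, 99.99, 99.999):
    print("{}% uptime allows about {:.1f} minutes down per year".format(
        pct, allowed_downtime_minutes(pct)))
```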


10 – Error Messages

  • Bad error message–unhelpful
  • “Don’t do this to people”
  • E.g. 404 – “Item Not Found” — not great, just repetitive
  • Error code
    • 404, Item not found, E000007
  • Add error_code and more_info with URI that explains error code
    • Simple
    • And you better have information on the error at the page
  • E.g. HAL – additional links with every payload
    • Typically return this stuff with an actual resource
  • Could avoid response body entirely
    • Just set 404
    • And add some stuff to header (spec allows this)
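
A sketch of an error payload along the lines described above, with an `error_code` and a `more_info` link. The field names `status` and `message`, the documentation base URL and the helper name are assumptions for illustration.

```python
import json

# Hypothetical documentation location; each error code should have a page here.
DOCS_BASE = "https://api.example.com/docs/errors"


def error_body(status, message, error_code):
    """Serialize an error response that tells the developer where to read more."""
    return json.dumps({
        "status": status,
        "message": message,
        "error_code": error_code,
        "more_info": "{}/{}".format(DOCS_BASE, error_code),
    })


payload = json.loads(error_body(404, "Item not found", "E000007"))
print(payload["more_info"])
```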


11 – Logging and Debugging

  • API going to break, unavoidable
  • RunScope – good tool, proxying, catching web hooks
    • Web-based
    • Not appropriate in regulated field
    • Don’t use if you can’t leak private data
  • Fiddler
  • Postman
  • Building Your API Utility Belt
    • Another talk
    • Use the right tool for the job


Building

  • “Principle of Least Surprise”
  • Designed workflows
  • The One True Way
  • Authentication


Maintaining

  • Error messages & handling
  • Logging & debugging


12 – Do you Have a Business Model

  • You’re building trust with users
  • Fastest way to destroy trust is to disappear as a business
  • If your business disappears, you’ve just screwed somebody
  • API must support bottom line of API builder
  • As customers become more successful, scaling up with your API, you make more money
  • If just an experiment, tell people that

That Conference 2017 – The Static Web Revolution

That Conference 2017, Kalahari Resort, Lake Delton, WI
The Static Web Revolution – Steven Hicks

Day 2, 8 Aug 2017

Executive Summary

  • Lots of good reasons to publish static web site, rather than dynamic
  • Need to select a Static Site Generator
  • Lots of tools to choose from, but there are tradeoffs


History of the Web

  • “Before” time
    • 1997 – static HTML
    • Scrolling / blinking
    • Hard to maintain static files
  • Dynamic back-ends
    • PHP, ASP
    • ASP.NET, Rails, Node
  • CMS on top of frameworks
    • WordPress, Drupal
    • Marketing guys could now manage the actual content
  • Bootstrap
    • Everything based on Bootstrap
    • Tons of stuff is based on WordPress


How did we get here?

  • Bootstrap & CMS for all content
  • “Simple enough for anyone to use”
  • “Any other way would be a maintenance nightmare”


A Hero Rises

  • “Are we using the right tool”?
  • “Is your content truly dynamic”?
  • “How often does your content change”?
  • “Is a dynamic site worth the cost of support?”


Surprising Numbers

  • 28% of sites on Internet run on WordPress
  • 70% of WordPress installs are vulnerable
  • 20% of top WordPress plug-ins are vulnerable


Goals

  • Compelling alternative to traditional dynamic website


Dynamic Website

  • Content in database
  • HTML generated when user requests it
  • Content managed with CMS
  • Admin tool to administer stuff, add new content, etc.
    • Can be in CMS
  • Magic happens between database and CMS
    • Convert data to HTML
    • Happens for every user


Static Website

  • Static files
  • Generated when content changes
  • User request–return static HTML
  • Static site generator–generates the HTML
  • The magic happens less frequently, when content changes


Why Go Static

  • Speed – static HTML is faster
    • CDN ready
  • Security
    • Fewer moving parts that can be attacked
  • Simplicity
    • Fewer points of failure
    • Don’t worry about database, caching, etc.
  • Scalability
    • Static is easier to scale out
  • Source Control
    • Just store content in source control (CM)
    • Rather than database data, which doesn’t automatically track history


How Do I Get Started?

  • Static Site Generator
    • Command-line tool
    • Flat files as input
    • Output is complete website of static HTML
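
To show how little a generator has to do at its core, here is a toy sketch: flat text files in, complete HTML pages out. It is not how any of the real generators below work internally; the file layout (first line is the title), the template and the function name are assumptions.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# Shared page shell: header and nav stay the same, title and content are injected.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><nav>Home</nav>\n<main>$content</main></body></html>\n"
)


def build_site(source_dir, output_dir):
    """Render every .txt file in source_dir to an .html file in output_dir."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(source_dir.glob("*.txt")):
        lines = src.read_text(encoding="utf-8").splitlines()
        title = lines[0] if lines else src.stem
        content = "\n".join(
            "<p>{}</p>".format(html.escape(line)) for line in lines[1:] if line.strip())
        target = output_dir / (src.stem + ".html")
        target.write_text(
            PAGE_TEMPLATE.substitute(title=html.escape(title), content=content),
            encoding="utf-8")
        written.append(target)
    return written


# Demo run in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src_dir, out_dir = Path(tmp, "content"), Path(tmp, "public")
    src_dir.mkdir()
    (src_dir / "about.txt").write_text("About\nHand-written <notes>.\n", encoding="utf-8")
    pages = build_site(src_dir, out_dir)
    demo_name = pages[0].name
    demo_html = pages[0].read_text(encoding="utf-8")
print(demo_html)
```

The build only runs when content changes; the output directory can then be copied to any static host.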


How Do I Customize?

  • CSS/SASS
  • Themes (if you’re lucky)
    • Not a ton of support for it, though


How Do I Write Content?

  • Header & Nav don’t change (Template)
  • Content


Templates – EJS

  • Inject stuff like title, content


Templates – Other

  • Handlebars
  • Pug (Jade)


Writing Content

  • Text editor, markdown
  • Add metadata
    • Depends
    • FrontMatter – e.g. keyword/values in +++ section
    • Separate files
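
A rough sketch of what a generator does with front matter: split the `+++` block from the body and read the key/value pairs. This hand-rolled parser only handles simple `key = "value"` lines and is an assumption for illustration; real generators use a full TOML or YAML parser.

```python
def split_front_matter(text, delimiter="+++"):
    """Split a document into (metadata dict, body text)."""
    lines = text.splitlines()
    # No opening delimiter on the first line: treat everything as body.
    if not lines or lines[0].strip() != delimiter:
        return {}, text
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == delimiter)
    except StopIteration:
        # Opening delimiter without a closing one: leave the text untouched.
        return {}, text
    meta = {}
    for line in lines[1:end]:
        if "=" in line:
            key, value = line.split("=", 1)
            meta[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


SAMPLE = '+++\ntitle = "Hello"\ndate = "2017-08-08"\n+++\n\n# Hello\n\nFirst post.\n'
meta, body = split_front_matter(SAMPLE)
print(meta, body)
```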


Choosing a Generator

  • Staticgen.com
  • 200+ different generators
  • Potential criteria
    • Engine language
    • Templating language
    • Simplicity vs. Customization
    • Extensibility
    • Frontmatter Support


Suggestions

  • Jekyll
    • Pros: Support, themes, features
    • Cons: Difficult setup, bad support for Windows
  • Hugo
    • Pros: Fast, Support, Themes
    • Cons: No extensibility
  • Harp
    • Pros: Simple
    • Cons: No extensibility, no frontmatter
  • Gatsby/Phenomic
    • Pros: React, PWA, Momentum
    • Cons: React, Young (you might not like React)


How Do I Host?

  • Existing infrastructure – drop the files on the server
  • Amazon S3
  • Dropbox – public folder
  • GitHub Pages
  • Dedicated Static Hosts (surge, Aerobatic)


Netlify

  • Features
    • Run builds
    • Free custom domains
    • Free SSL
    • Baked-in CDN
    • Free Pro upgrade for open source projects


When Should You Go Static?

  • Blogs
  • Online magazine (large site)
  • Portfolios
  • Brochures
  • Docs
  • Style Guides
  • Events


JAM Stack

  • JavaScript
  • APIs
  • Markup


JAM Stack Examples

  • Forms – Google Forms, Typeform
  • Commerce – SnipCart
  • Site Search – Lunr.js
  • User Data – Firebase – offload user data to somebody else
  • Serverless – functions as a service – Azure Functions, AWS Lambda, serverless.com
  • See theNewDynamic.org/tools


Supporting Content Authors

  • Need visual authoring tool for content generators (non-technical)
  • Text Editor
  • Headless CMS – CMS as a service
    • Without headless CMS – .git repo feeds generator
    • With headless CMS – feed static generator from headless CMS in cloud
  • Contentful – still have to write markdown
  • Netlify CMS
    • Integrates with .git
    • Drop something in your repository
    • Uses GitHub authentication to log into CMS (admin page)
    • Content saved to your repository
    • Integrates nicely with Netlify
  • headlessCMS.org


The Static Web Revolution

  • New way – more modular, smaller tools that each does a different thing

That Conference 2017 – Continuous Database Deployment

That Conference 2017, Kalahari Resort, Lake Delton, WI
Continuous Database Deployment – Mike Acord

Day 2, 8 Aug 2017

Executive Summary

  • Important to have database assets under source control
  • Several different strategies, each with pros/cons
  • Most likely approaches–schema-based or script-based


Continuous Deployments

  • We continuously deploy our code
    • Automated deployments
    • Integration tests
    • Build pipeline
  • But we’re not doing this with databases
    • More difficult
    • Harder to avoid losing data
    • More risky


Databases – Instead

  • Manual change scripts
  • Schema comparisons (SQL Compare)
  • Create change script, run against database
  • Problems
    • Error prone
    • Risky
    • Slow


What Is Continuous Database Deployment

  • Automated deployments
  • Integration tests
  • Build pipeline
  • Diagram
    • Database should be in source control
    • Trigger from changes in CM leads to Continuous Integration
    • Leads to Release Management
    • Staged releases–QA, Test


Benefits

  • Less error prone, more repeatable
  • Anybody can do this
  • Testable–deploying to different environments
    • At each phase, you’re re-testing the deployment script
  • Faster to release
  • Easier to merge branches
    • Work item has unique code and unique database changes
  • Easier to refresh environments
    • And easier to refresh to different environments
    • Stuff already in DEV, ready to go


Methods

  • Schema-based
    • Current schema is always in source control
    • Script in CM for every object in your database
  • Script-based
    • Change scripts checked in and applied
    • You just create a delta script
    • Never have full history
  • Code-based
    • Apply database changes from code
    • E.g. Migrations in .NET


Schema Based

  • Provides history of schema
  • Can generate change scripts to reverse changes
  • Less likely to screw up change scripts
  • Compare and sync schema to deploy changes
  • Can make changes to database and sync back to model
    • Can generate blank database at any point
  • Challenges
    • Merging can be challenging
    • Black voodoo magic leads to nasty change scripts
    • Deployment process is more complex
    • Forget to check in changes and lose them


SQL Source Control

  • Shows changes
  • Create migration scripts
  • Deals with situation of adding non-nullable columns


Visual Studio Database Project

  • SQLCompare utility to apply schema changes


Example

  • Octopus Deploy used to deploy changes
  • Creates Powershell scripts (or uses scripts)
    • Applying .dacpac files using the SqlPackage command-line tool
  • NB: RedGate has plugins to SQL Source Control for various CI tools


Script Based

  • Create change scripts to apply
    • Script for every change
  • Simple to understand
  • Easy to implement
  • No black magic
  • Easy to test scripts
  • Challenges
    • May need to review change scripts
    • No ‘current’ database that you can deploy
      • Backup/restore to get copy of current


Example – DBUP + Octopus Deploy

  • Storing migration scripts in DB project
  • DBUP uses LOG in database to see what needs to be run; applies migration in order
  • Also then applies environment-specific scripts


Code Based

  • Quick setup for new developers
  • Allows for seed data
  • Simple to use
  • Can generate change scripts
  • Challenges
    • Testing changes can be awkward
      • Hard to test because migration only happens in the context of running your app
    • May feel unnatural
    • Rolling deployments more challenging
      • One app does the update, but other apps aren’t yet up to date


Example – Entity Framework Migrations


Database Refresh – Keep Data Current

  • Quickly refresh environments from production
  • Can be scheduled to happen daily
  • Allows better diagnosis of data issues
  • Daily testing of deployment scripts
    • By virtue of pushing between DEV / STAGE / PRODUCTION
  • Challenges
    • Clustered servers
    • Large databases
      • Can be painful, >1TB


Example – Database Refresh

  • Using Octopus Deploy


Best Practices

  • Create backups or snapshots before doing anything else
  • Avoid data loss
  • Automated integration tests
  • Deploy breaking changes in steps
  • Example (field rename)
    • Add new field with synchronization trigger
    • Modify application to use new column
    • Remove old column and synchronization trigger
  • Do this in stages when you can’t take time window to do everything at once
  • Lots of patterns for applying changes in non-breaking patterns like this
    • Agiledata.org/essays/renameColumn.html
    • Book – Refactoring Databases
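
The field-rename example can be sketched with SQLite from Python. Table and column names are hypothetical, the triggers only sync from the old column to the new one, and the final column drop is left as a comment because it depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, surname TEXT)")
conn.execute("INSERT INTO customer (surname) VALUES ('Smith')")

# Step 1: add the new column, backfill it, and keep it in sync with triggers
# so the old application version can keep writing to `surname`.
conn.execute("ALTER TABLE customer ADD COLUMN last_name TEXT")
conn.execute("UPDATE customer SET last_name = surname")
conn.execute("""
    CREATE TRIGGER customer_sync_insert AFTER INSERT ON customer
    WHEN NEW.last_name IS NULL
    BEGIN
        UPDATE customer SET last_name = NEW.surname WHERE id = NEW.id;
    END
""")
conn.execute("""
    CREATE TRIGGER customer_sync_update AFTER UPDATE OF surname ON customer
    BEGIN
        UPDATE customer SET last_name = NEW.surname WHERE id = NEW.id;
    END
""")

# The old application version is still running and only knows `surname`.
conn.execute("INSERT INTO customer (surname) VALUES ('Jones')")
conn.execute("UPDATE customer SET surname = 'Smythe' WHERE id = 1")
rows = conn.execute("SELECT surname, last_name FROM customer ORDER BY id").fetchall()
print(rows)

# Step 2: deploy the application version that reads and writes `last_name`.
# Step 3: once nothing uses `surname`, drop the triggers and the old column.
conn.execute("DROP TRIGGER customer_sync_insert")
conn.execute("DROP TRIGGER customer_sync_update")
# ALTER TABLE customer DROP COLUMN surname  (supported in newer SQLite versions)
```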


Questions

  • Problem with database project – comparing two schemas, scripts that it generates just don’t work
    • Turn on setting to catch errors on project build

That Conference 2017 – Concurrent Programming in .NET

That Conference 2017, Kalahari Resort, Lake Delton, WI
Concurrent Programming in .NET – Jason Bock (@JasonBock)

Day 2, 8 Aug 2017

Executive Summary

  • Doing concurrency correctly is very hard when doing it at a lower level
  • Don’t use Thread, ThreadPool anymore
  • Learn and use async/await


Background

  • Concurrency is tough, not as easy as doing things sequentially
  • Games are turn-based
    • But interesting exercise to do chess concurrently
    • Much more complicated, just as a game
  • We’ll just look at typical things that people struggle with


Terms

  • Concurrency – Work briefly and separately on each task, one at a time
  • Parallelism – People all working at the same time, doing separate tasks
  • Asynchronous – Once you start a task, you can go off and do something else in the meantime


Recommendations

  • Stop using Thread, ThreadPool directly
  • Understand async/await/Task
  • Use locks wisely
  • Use concurrent and immutable data structures
  • Let someone else worry about concurrency (actors)


Stop Using Thread, ThreadPool

  • This was the way to do concurrency back in .NET 1.0
  • Diagram – Concurrent entities on Windows platform
    • Process – at least one thread
    • Thread – separate thread of execution
    • Job – group of processes
    • Fiber – user-level concurrent tasks
  • But mostly we focus on process and thread
  • Example–multiple threads
    • Use Join to observe at end
  • Problems
    • Stack size
    • # physical cores
  • # Cores
    • May not really be doing things in parallel
    • More threads than cores–context switch and run one at a time on a core
    • Context switches potentially slow things down
    • Don’t just blindly launch a bunch of threads
  • Memory
    • Each thread -> 1 MB memory
  • ThreadPool is an improvement
    • You don’t know when threads are complete
    • Have to use EventWaitHandle, WaitAll
    • Lots of manual work
    • At least threads get reused
  • Existing code that uses Threads works just fine


Async/Await/Tasks

  • Models
    • Asynchronous Programming Model (APM)
    • Event-based Asynchronous Pattern (EAP)
    • Task-based Asynchronous Pattern (TAP)
  • Demo – console app
    • AsyncContext.Run – can’t call async method from non-async method
    • AsyncContext – from Stephen Cleary – excellent book
    • This goes away in C# 7.1 – compiler will allow an async Main method
    • Reading file
  • Misconception – that you create threads when you call async method
    • No, not true
    • Just because method is asynchronous, it won’t necessarily be on another thread
  • Use async when doing I/O bound stuff
    • Calling ReadLineAsync, you hit an I/O Completion Port (IOCP); when done, let caller know
    • When I/O is done, calling thread can continue
    • Asynchronous calls may also not necessarily hit IOCP
  • If you do I/O in non-blocking way on a thread, you can then use thread to do CPU work while the I/O is happening
    • Performance really does matter–e.g. use fewer servers
  • When you call async method, compiler creates asynchronous state machine
    • Eric Lippert’s continuation blog post series
  • Task object has IsCompleted
    • Generated code needs to first check state to see if task completed right away
  • Good news is that you don’t need to write the async state machine plumbing code
  • Can use Tasks to run things in parallel
    • Task.Run(() => ..)
    • await Task.WhenAll(t1, t2);
  • Tasks are higher level abstraction than threads
  • Don’t ever do async void
    • Only use it for event handlers
  • Keep in mind that async tasks may actually run synchronously, under the covers
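
The talk's demos were in C#. As a rough analogue in Python (not the speaker's code), `asyncio.gather` plays a role similar to `Task.WhenAll`, and recording the thread id illustrates the point that an async call does not by itself create a thread. The function names and delays are made up.

```python
import asyncio
import threading


async def fetch(name, delay):
    """Stand-in for an I/O-bound call: the sleep yields instead of blocking."""
    await asyncio.sleep(delay)
    # Record which thread resumed this coroutine after the wait.
    return name, threading.get_ident()


async def main():
    # Start both operations, then wait until both have finished.
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
names = [name for name, _ in results]
thread_ids = {tid for _, tid in results}
print(names, len(thread_ids))
```

Both operations overlap in time, yet everything runs on the one thread that called `asyncio.run`.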


Demo – Evolving Expressions

  • Code uses all cores available


Locks

  • Don’t use them (or try not to use them)
  • If you get them wrong, you can get deadlocks
  • Don’t lock on
    • Strings – You could block other uses of string constant
    • Integer – you don’t lock on the same thing each time, since you lock on boxed integer
  • Just use object to lock on
  • Interlocked.Add/Decrement
    • For incrementing/decrementing integers
    • Faster
  • Tool recommendation: BenchmarkDotNet
  • SpinLock
    • Enter / Exit
    • Spins in a while loop
    • Can be slower than lock statement
    • Don’t use SpinLock unless you know it will be faster than other methods
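
A Python analogue of "just use object to lock on" (again, not the C# from the talk): the lock is a private object that nothing else can reach, so no unrelated code can contend for it by accident. Class and function names are illustrative.

```python
import threading


class Counter:
    def __init__(self):
        self._lock = threading.Lock()  # private, dedicated lock object
        self.value = 0

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def hammer(counter, times):
    """Call increment repeatedly from one worker thread."""
    for _ in range(times):
        counter.increment()


counter = Counter()
workers = [threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(counter.value)
```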


Data Structures

  • List<T> is not thread-safe
  • ConcurrentStack<T>
    • Use TryPop
  • ImmutableStack<T>
    • When you modify, you must capture the new stack, result of the operation
    • Objects within the collection can change


Actors

  • Service Fabric Reliable Actors
  • Benefits
    • Resource isolation
    • Asynchronous communication, but single-threaded
      • The actor itself is single-threaded
      • So in writing the actor, you don’t need to worry about thread safety


Demo – Actors – Using Orleans

  • “Grains” for actors

That Conference 2017 – Refactoring Monolith Database Stored Procedures

That Conference 2017, Kalahari Resort, Lake Delton, WI
Refactoring Monolith Database Stored Procedures – Riley Major (@RileyMajor)

Day 2, 8 Aug 2017

Executive Summary

  • Re-factoring large stored procedures helps with testability, comprehensibility, performance
  • Main strategies–reorganize, factor out pure business logic into UDFs, do updates at the end
  • Testing strategy–use transactions to undo test calls to before/after versions of stored proc, store data to compare in table variables


Monowhat?

  • Evolves over time
  • Long
  • Does multiple things
  • Disorganized
  • Fragile
  • Untestable
  • Scary


Layers, Like an Onion

  • Programming layers
    • Presentation
    • Business Logic
    • Data Storage
  • Tiers
    • Client app (HTML)
    • Server (ASP.NET)
    • Database (SQL Server, NoSQL)
  • Best Practice
    • Presentation in client application
    • Business Logic in server


Oh Noes!

  • Monolith
    • Presentation in database
    • Business Logic in database
  • Bad
    • Database scaling is hard
    • Causes vendor lock-in
    • Database languages are primitive
  • But
    • Close to data, less overhead
    • Who really changes databases?
    • SQL more powerful than you think
      • Could be faster to make changes to multiple tables down in the database


Turtles All the Way Down

  • Separate layers even on the same tier
    • Browser: MVVM
    • Server: MVC
    • Database: TBD
  • Database layer
    • Presentation / Business Logic
      • IF / CASE / SET / SUM / DATEADD
      • If this logic is going to be here anyway, we should try to architect it
    • Data access
      • SELECT / UPDATE / INSERT / DELETE
  • Testability, Isolation, Portability
    • Reasons to structure bus logic in database


Make a plan

  • What are goals?
    • Better performance?
    • Maintenance?
    • Understandability?
    • Testability?
  • How will you know you’ve achieved the goals?
    • Speed benchmarks
    • Less repetition
    • Smaller sections of code
    • Actually having a testing suite


Survey the damage

  • Can’t avoid a thorough code review
  • Look for data modification
    • INSERT, UPDATE, DELETE
    • Note columns affected
  • Look for external effects
    • CLR
    • E-mail generation
    • SPS: Triggers
  • Look for transaction handling
    • BEGIN TRAN, COMMIT, ROLLBACK
    • Harder to re-factor if existing code uses transactions
    • Can have nested transactions
    • Rollback goes all the way up the stack


Don’t Break Anything

  • Build a development environment
    • Need to be able to play around
    • Need realistic data (volume and content)
    • Maybe not real data
  • Work in isolation
    • Were changes the result of you or somebody else?
    • You really need to isolate your changes
    • Slow because resources being used elsewhere?
  • How can you tell if you broke something?
    • Need to capture before and after state
      • Look across entire database potentially
    • Aim for deterministic process
    • Easy to know if you broke it if you know what it’s supposed to do


Deterministic

  • Function returns the same result, given the same inputs
  • Easy to test – send same values in before/after your changes
  • Things that break determinism
    • Random numbers
    • Time
    • Pulling data from database (underlying table’s contents can change)


Play It Again, Sam

  • Why we want determinism–so you can compare data before/after your changes


Good Luck with That

  • (Monolith stored proc) Likely not deterministic
  • Monoliths change state
  • Need to go back in time
  • Can use transactions to do this
    • Revert to previous state


Become a wrapper

  • To test impact of code changes, wrap your calls
    • Begin transaction
    • Run original code
    • Capture changed data
    • Rollback
    • Run new code
    • Capture changed data
  • Compare the 2 captured data sets


Oops

  • But need to save changes somewhere
  • Captured data is also rolled back
  • Need to preserve the changes that you captured, even during rollback
    • Could save to local storage
    • Could save to another database
    • Print results to database console


Build a Ghost House

  • How to capture doomed data?
    • Outside SQL Server–hard
    • Another thread with NOLOCK–hard
  • What’s immune from transactions?
  • Variables
  • You can’t have a variable for every row
  • One big XML? Ouch
  • Table variables survive transactions
    • Written to disk, but not stored to database
  • They’re ghost houses


Spooky Playground – Create House

  • DECLARE @Orders TABLE (Field1 int, Field2 int);
  • Could use tricks here–store checksums or hashes instead of actual data
  • Typically create one table variable for each DB table that will get changed
    • And set only the columns that you expect to change


Spooky Playground – Fill House

  • BEGIN TRAN; EXEC monolith;
  • UPDATE @Orders SET x = x FROM Orders
  • ROLLBACK
  • BEGIN TRAN; EXEC monolith_New; capture into @Orders again; ROLLBACK


Spooky Playground – Compare

  • SELECT * FROM @Orders WHERE colA_Before <> colA_After
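
The three "Spooky Playground" steps can be strung together roughly as follows. This is a minimal sketch for a scratch database, not the speaker's demo code: the table dbo.Orders, its columns, and the procedures dbo.Monolith and dbo.Monolith_New are hypothetical stand-ins. It assumes neither procedure issues its own ROLLBACK (which would also end the outer transaction), and it only compares rows that exist after the first run. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Scratch setup: hypothetical objects, for illustration only.
CREATE TABLE dbo.Orders (OrderID int PRIMARY KEY, Status varchar(20) NOT NULL);
INSERT INTO dbo.Orders (OrderID, Status) VALUES (1, 'New'), (2, 'New');
GO
CREATE PROCEDURE dbo.Monolith AS
    UPDATE dbo.Orders SET Status = 'Ready' WHERE OrderID = 1;
GO
CREATE PROCEDURE dbo.Monolith_New AS
    -- Deliberately behaves differently so the harness has something to find.
    UPDATE dbo.Orders SET Status = 'Ready' WHERE OrderID IN (1, 2);
GO

-- The "ghost house": a table variable is not undone by ROLLBACK.
DECLARE @Orders TABLE (
    OrderID       int PRIMARY KEY,
    Status_Before varchar(20) NULL,  -- state left by the original proc
    Status_After  varchar(20) NULL   -- state left by the refactored proc
);

-- Run 1: original code, capture, then go back in time.
BEGIN TRAN;
    EXEC dbo.Monolith;
    INSERT INTO @Orders (OrderID, Status_Before)
    SELECT OrderID, Status FROM dbo.Orders;
ROLLBACK;  -- dbo.Orders reverts; @Orders keeps its rows

-- Run 2: new code, capture into the same rows, roll back again.
BEGIN TRAN;
    EXEC dbo.Monolith_New;
    UPDATE t
    SET    t.Status_After = o.Status
    FROM   @Orders AS t
    JOIN   dbo.Orders AS o ON o.OrderID = t.OrderID;
ROLLBACK;

-- Compare: any row returned is a behavior change (here, OrderID 2).
SELECT OrderID, Status_Before, Status_After
FROM   @Orders
WHERE  Status_Before <> Status_After;
```

With nullable columns, a plain <> comparison skips rows where one side is NULL, so a real harness would need an explicit NULL check as well. Rows inserted only by the second run would also need a separate capture step.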


Mock Your Black Boxes

  • Transactions only work on the database
  • External effects aren’t rolled back
  • Replace external calls with “mocks”
  • They look and act like external calls
  • But you control the guts
  • Return hard-coded sample data
  • Have the mock log its inputs
    • You’ll need to see what was sent, to make sure it would have done the same thing
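
One possible shape for such a mock, assuming the monolith calls a procedure named dbo.SendEmail with these three parameters (the name, parameters, and log table are all hypothetical). CREATE OR ALTER needs SQL Server 2016 SP1 or later; on older versions, ALTER PROCEDURE does the same job.

```sql
-- Hypothetical log table that records what the monolith tried to send.
CREATE TABLE dbo.MockEmailLog (
    LogID     int IDENTITY(1,1) PRIMARY KEY,
    Recipient nvarchar(256) NOT NULL,
    Subject   nvarchar(255) NULL,
    Body      nvarchar(max) NULL
);
GO
-- Same name and parameters as the real procedure, but different guts:
-- nothing leaves the database, the inputs are only logged.
CREATE OR ALTER PROCEDURE dbo.SendEmail
    @Recipient nvarchar(256),
    @Subject   nvarchar(255),
    @Body      nvarchar(max)
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.MockEmailLog (Recipient, Subject, Body)
    VALUES (@Recipient, @Subject, @Body);
    RETURN 0;  -- hard-coded "success"
END;
GO
-- Quick check that the mock records its inputs.
EXEC dbo.SendEmail @Recipient = N'test@example.com',
                   @Subject   = N'Order shipped',
                   @Body      = N'Hello';
SELECT Recipient, Subject, Body FROM dbo.MockEmailLog;
```

Because the log is an ordinary table, its rows are rolled back along with everything else, so a compare harness would have to copy them into a table variable before the ROLLBACK, just like any other changed table.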


Make your time!

  • Date/Time functions kill determinism
  • You have to control “now”
  • Otherwise no two runs could be the same
  • So make your own time
  • And send it in as a parameter
    • Feed the monolith the current date/time
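
A small sketch of passing "now" in as a parameter. The procedure name, the @Now parameter, and the 30-day due date rule are made up for illustration. Defaulting the parameter to NULL is one way to leave existing callers unaffected.

```sql
CREATE OR ALTER PROCEDURE dbo.Monolith_WithClock
    @Now datetime2(0) = NULL   -- hypothetical parameter: the caller's idea of "now"
AS
BEGIN
    SET NOCOUNT ON;

    -- Only ask the system clock when the caller did not supply a time.
    IF @Now IS NULL SET @Now = SYSDATETIME();

    -- Everything below uses @Now, never GETDATE() or SYSDATETIME() directly.
    SELECT @Now AS RunAt, DATEADD(DAY, 30, @Now) AS DueDate;
END;
GO
-- Test runs pin the clock, so two runs can produce identical results.
EXEC dbo.Monolith_WithClock @Now = '2017-08-09T10:00:00';
```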


Your Petard Should Hoist

  • Move variable DECLAREs to the top
  • Reveals duplication
  • Reveals common data sources
  • Displays breadth of data required
  • Caution: DECLARE assignment
    • Leave SET down below when you pull the DECLARE up
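
The DECLARE assignment caution matters because an initializer runs at the point of the DECLARE. A sketch with a made-up temp table shows why only the declaration should move up:

```sql
-- Original (mid-procedure) form, for comparison:
--   DECLARE @ReadyCount int = (SELECT COUNT(*) FROM #Orders WHERE Status = 'Ready');

-- After hoisting: only the declaration moves to the top.
DECLARE @ReadyCount int;

-- Stands in for earlier work in the procedure that changes the data.
CREATE TABLE #Orders (OrderID int PRIMARY KEY, Status varchar(20) NOT NULL);
INSERT INTO #Orders (OrderID, Status) VALUES (1, 'Ready'), (2, 'Ready');

-- The assignment stays where the original DECLARE was, so it still sees
-- the rows inserted above. Hoisting it too would change the result.
SET @ReadyCount = (SELECT COUNT(*) FROM #Orders WHERE Status = 'Ready');

SELECT @ReadyCount AS ReadyCount;
```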


One SELECT to Rule them All

  • Gather scattered SELECT statements to top
  • Reveals duplication
  • Prepares for separation
  • Prepares for shorter transactions
  • Use single SELECT with fancy SQL, if practical


Measure Twice, Cut Once

  • Find INSERT / UPDATE / DELETE
  • Replace with variable SETs
    • Store what these statements were supposed to do
  • Move data modification to end of proc
    • Shrinks amount of time when transactions are open
  • Results in 3 main sections
    • Data gathering
    • Computation
    • Data modification
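
A rough picture of the resulting three-section layout, using a made-up temp table and a made-up business rule. The UPDATE that used to sit in the middle of the logic becomes a variable assignment, and the real write happens once at the end inside a short transaction.

```sql
-- Hypothetical data, for illustration only.
CREATE TABLE #Orders (OrderID int PRIMARY KEY, Total decimal(10,2) NOT NULL, Status varchar(20) NOT NULL);
INSERT INTO #Orders (OrderID, Total, Status) VALUES (42, 750.00, 'New');

DECLARE @OrderID   int = 42,
        @Total     decimal(10,2),
        @NewStatus varchar(20);   -- records what the old UPDATE would have done

-- 1. Data gathering
SELECT @Total = Total FROM #Orders WHERE OrderID = @OrderID;

-- 2. Computation: no writes, only decisions stored in variables
IF @Total >= 500 SET @NewStatus = 'Priority';

-- 3. Data modification: the transaction is open only for the write itself
IF @NewStatus IS NOT NULL
BEGIN
    BEGIN TRAN;
    UPDATE #Orders SET Status = @NewStatus WHERE OrderID = @OrderID;
    COMMIT;
END;

SELECT OrderID, Total, Status FROM #Orders;
```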


Cases of CASES

  • What’s left in middle? Logic
  • Lots of Ifs, SETs, and calculations
  • Pull it all together in one giant statement
  • Usually performs better
  • Can be clearer
  • Can reduce code
  • Prepares for separation
  • CASE, Derived Tables, and CTEs are your friends
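
As a tiny illustration of collapsing IF/SET chains into one statement, assuming made-up discount and tier rules:

```sql
DECLARE @Total    decimal(10,2) = 250.00,
        @Discount decimal(4,2),
        @Tier     varchar(10);

-- Before: procedural branching, one SET per outcome.
-- IF @Total >= 500
--     BEGIN SET @Discount = 0.10; SET @Tier = 'Gold';   END
-- ELSE IF @Total >= 100
--     BEGIN SET @Discount = 0.05; SET @Tier = 'Silver'; END
-- ELSE
--     BEGIN SET @Discount = 0.00; SET @Tier = 'None';   END

-- After: the same rules as a single SELECT built from CASE expressions.
SELECT @Discount = CASE WHEN @Total >= 500 THEN 0.10
                        WHEN @Total >= 100 THEN 0.05
                        ELSE 0.00 END,
       @Tier     = CASE WHEN @Total >= 500 THEN 'Gold'
                        WHEN @Total >= 100 THEN 'Silver'
                        ELSE 'None' END;

SELECT @Discount AS Discount, @Tier AS Tier;
```

Once the logic is a single SELECT, it can be lifted out of the procedure more or less unchanged.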


Building Blocks

  • Still one procedure = still a monolith
  • Separate
    • Data Gathering – inline UDFs
    • Calculation – inline UDF
  • Allows data gathering re-use
  • Allows testing suite for business rules
  • Allows read-only monolith actions
    • Most important benefit
    • Can tell people what the business logic will do
    • Data in, data out
    • May want to use this function elsewhere
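
A minimal sketch of the calculation step as an inline table-valued function. The function name and the discount rule are hypothetical. Because it is data in, data out, it can be called read-only over sample inputs to show what the business logic would do.

```sql
CREATE OR ALTER FUNCTION dbo.CalcDiscount (@Total decimal(10,2))
RETURNS TABLE
AS
RETURN
(
    -- A single SELECT is what makes this an inline function.
    SELECT CASE WHEN @Total >= 500 THEN 0.10
                WHEN @Total >= 100 THEN 0.05
                ELSE 0.00 END AS Discount
);
GO
-- Read-only "what would happen" query over a few boundary values.
SELECT t.Total, d.Discount
FROM (VALUES (50.00), (100.00), (499.99), (500.00)) AS t (Total)
CROSS APPLY dbo.CalcDiscount(t.Total) AS d;
```

Saving the output of a query like this before a change and comparing it afterwards is one simple way to build up a regression suite for the business rules.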


It’s All Better Now

  • Reformed monolith
    • Recently written
    • Short
    • Orchestrates multiple things
    • Repeated code eliminated
    • Organized into functions
    • Vials of reagents to mix – based on pieces
    • Problems isolated
    • Testable
    • Benign–not scary


Note on functions

  • Scalar user-defined functions in SQL Server perform much worse than inline table-valued user-defined functions
  • Inline table-valued functions are treated by the engine just like a view
  • Especially bad performance when you use result of a scalar function in a WHERE clause
  • Other problems with multi-statement table-valued function


Demo – Walk through this process

  • Mock out stored proc that sends e-mail
    • Just store input data
  • To test encapsulation, run monolith twice, in transaction
    • You’ll see differences for non-deterministic stuff, e.g. dates
  • Need to look through the differences and resolve them
    • e.g. by feeding in the date as a procedure parameter
    • Make copy of Monolith, but change only the date parameter business
    • If you now see no differences, it’s now deterministic
  • Now start re-factoring, but create different versions of the monolith
    • e.g. Monolith_HoistVariables
    • Move variable DECLAREs up to top
  • Beware of false positives
    • You might see no differences, but that could be due to having no data in database that exposes a bug that we just created
    • SPS: Or that test data doesn’t cause execution of a path where a bug is fixed
  • CROSS APPLY as performance improvement
    • Create new record sets on the fly
    • This is performance efficient
  • Do one UPDATE at bottom to make all of our changes
  • Move multiple IF statements into single big SELECT statement
    • Keep re-running the compare harness, continue seeing no changes after compare
  • Move hard-coded strings into function that returns table having named columns with constant values
    • No performance hit to do this
  • Pull many statements together into one big SELECT
    • Can then move into its own UDF
  • Make giant list of test data
    • Then use CROSS APPLY to pass this data into new UDF
    • Can then do regression testing
    • After future changes, you can say explicitly what’s going to break (change)
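
The "hard-coded strings into a function" step from the demo might look something like this. The function name, column names, and status values are assumptions, and the order rows are inline sample data rather than a real table.

```sql
-- One row of named constants, exposed as an inline table-valued function.
CREATE OR ALTER FUNCTION dbo.OrderConstants ()
RETURNS TABLE
AS
RETURN
(
    SELECT CAST('Ready'   AS varchar(20)) AS StatusReady,
           CAST('Shipped' AS varchar(20)) AS StatusShipped
);
GO
-- CROSS APPLY attaches the constants to every row, so the query can refer
-- to c.StatusReady instead of repeating the literal 'Ready'.
SELECT o.OrderID, o.Status
FROM (VALUES (1, 'Ready'), (2, 'Shipped'), (3, 'Ready')) AS o (OrderID, Status)
CROSS APPLY dbo.OrderConstants() AS c
WHERE o.Status = c.StatusReady;
```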

That Conference 2017 – The Rise of JavaScript-Driven Native App Development

That Conference 2017, Kalahari Resort, Lake Delton, WI
The Rise of JavaScript-Driven Native App Development – Rob Lauer

Day 1, 7 Aug 2017

Disclaimer: This post contains my own thoughts and notes based on attending That Conference 2017 presentations. Some content maps directly to what was originally presented. Other content is paraphrased or represents my own thoughts and opinions and should not be construed as reflecting the opinion of the speakers.

Executive Summary

  • Web apps not performant enough, native apps too hard to implement
  • Hybrid apps too much of a compromise
  • Xamarin just “1st-gen”
  • Move toward React Native / NativeScript
  • NativeScript comes with a bit more “out of the box”

Rob Lauer – @RobLauer

  • Senior Manager, Developer Relations
  • Progress (Telerik)

Two things

  • JavaScript-Driven Native
  • NativeScript

Covering today

  • Rise of “JavaScript-driven Native”
  • Intro to NativeScript
  • NativeScript core concepts

Environment for apps

  • Native apps are in silos, for the various platforms

Cross platform options

  • Mobile Web – PWA (Progressive Web Apps)
    • Mobile-first web site, or mobile variant
  • Hybrid – leveraging web view on device (e.g. Chromeless browser)
  • 1st gen X-Plat Native
    • e.g. Xamarin
  • Native

Not available to web app

  • Offline support
  • Device APIs
  • Home screen availability

Hybrid Promise

  • In between native and full web
  • Reuse web skills
  • Rich UI
  • Supposed to be best of both worlds
  • It’s basically a set of compromises

Hybrid Problems

  • Lots of press talking hybrid apps down

Hybrid Reality

  • 50% dev time doing first 80%
  • Another 50% doing final 20%
  • Need to tweak things because of bad performance

Binary choice

  • Hybrid vs. Native
  • Hybrid
    • Fast to market
    • Compromise on UX
  • Native
    • Best experience
    • One platform at a time

Pie chart

  • How important is mobile app performance
  • 80% somewhat/very important

If user perceives poor performance

  • Switch to another app, etc.

Third choice

  • Native JavaScript
  • Native UI driven by JavaScript
  • App runs in JavaScript engine, bridge to native

Biggest JavaScript Native

  • React Native
  • NativeScript (Progress)
  • Other smaller ones
    • Weex
    • FuseTools
    • Flutter

React Native vs. NativeScript

  • Both work in similar way
  • Native UI, Native APIs running on JavaScript engine
  • NativeScript – write once
    • Plugins created with JS/TypeScript
    • On Day 0, NativeScript ready to go
    • Single-threaded (UI thread)
  • React Native – write API wrapper for each platform
    • React support
    • API access via native modules (you write)
    • UI Thread vs. JavaScript Thread
  • Performance between the two is very similar

{N} vs RN

  • React vs. Angular/Vue/Vanilla
  • Progress vs. Facebook
  • BSD (RN) vs. Apache
  • Community – RN community is bigger

JavaScript-Driven Native

  • Faster to market
    • Reuse existing skills
    • Reuse existing libraries
  • Best experience
  • True native UI

Xamarin

  • Progress/Telerik still supporting the idea of Xamarin and cross-compiled web apps
  • e.g. telerik.com/xamarin-ui

Intro to NativeScript

  • Timeline – 2013 to 2017 (mass adoption)
  • npm downloads going up
  • Ionic still dominant
  • RN and {N} battling it out

What is NativeScript?

  • Open source framework
  • Native mobile for iOS, Android (eventually Windows 10)
  • Use web skills
  • Write once, run everywhere
    • Share 100% code between iOS/Android
    • Share 80% code with web – can easily port Angular, but just rewrite view

Differentiators

  • No compromise UI
  • Measurable native UI performance
  • Maximum code and skill reusability
  • Reuse existing native libraries

How Does NativeScript Work?

NativeScript Module Layer (NML)

  • Abstractions on top of native APIs
    • Dozens available out of the box
    • All native APIs still available at JavaScript layer

Architecture

  • Application Code  (can occasionally hit NativeScript Runtime)
  • NativeScript Modules
  • NativeScript Runtime
  • iOS / Android at bottom

Putting It All Together

  • Define UI with Markup
  • Back-end logic is JavaScript
  • CSS

Architecture choice

  • JavaScript
  • TypeScript
  • Angular

First App

  • npm install
  • tns run ios
  • tns run android

NativeScript LiveSync

  • Refresh app with latest changes to JS, CSS, XML
  • No re-build
  • Works with emulators and devices

Demo – Simple NativeScript App

  • tns create myapp
  • main-page.xml
    • A few HTML elements, though custom Telerik

Pages/Views

  • XML markup structure
  • Elements (<Page>, <Label>) are NativeScript modules
    • Cross-platform abstractions
    • E.g. <Switch>
  • Lots of UI Widgets out of the box

Layouts (Traditional)

  • Layout containers
    • Absolute
    • Dock
    • Grid
    • Stack
    • Wrap
  • Power comes from when you start nesting layout containers inside of each other
  • Also provide Flexbox

Custom XML Components

  • Custom controls
  • Encapsulate reusable UI in components
  • Can be JS only or XML + CSS + JS

Platform-Specific Capabilities

  • File naming: xxx.android, xxx.ios
  • Markup: chunks – <android>,
  • Attributes: android:blah, ios:blah
  • Write once by default
    • But can customize per target

Styling with CSS

  • Conventions
    • app.css
    • myview.css
    • myview.ios.css

Supported Selectors

  • Element Type, class, id

Sass and LESS Available

UI Components

  • Nativescript-ui

JavaScript Code Behind

  • Vanilla JavaScript
  • Built-in MVVM pattern
  • Angular support
  • TypeScript support

Handling Events

  • Create event handlers in JavaScript

Navigating Views

Basics

  • Navigating with topmost

View Transitions

  • Standard, e.g. curl

Data Binding

  • Available out of the box
  • Two-way

Animations

  • Animate – various properties
  • Configured – props
  • Chain animations
  • Animate multiple properties and elements

Custom Fonts

  • TTF or OTF
  • Drop-in

Debugging Strategies

  • console.log
  • Developer Tools – e.g. Chrome DevTools
  • IDE – e.g. Visual Studio and Visual Studio Code
  • Free extension for Visual Studio Code

Anything Else?

Two Ways to use

  • Command Line Interface (CLI)
  • NativeScript Sidekick
    • Builds on top of CLI
    • Services like cloud-based builds
    • Cross-platform
    • Starter kits, app templates
    • Plug-ins management
    • Visual Studio integration coming

What about Angular?

  • Unified app concepts across web and native mobile

Community

  • Plugins.nativescript.org
  • Forum.nativescript.org

Bottom Line

  • No web views (platform native UI)
  • Use JavaScript (or TypeScript)
  • 100% access to all native APIs
  • Made for web devs (JS, CSS, XML)
  • Use Angular for web and native mobile
  • Reuse thousands of libraries from Node/iOS/Android/Web