Nathan Herald

My core values

2015-03-18T03:38:15-07:00

Others’ needs and wants are more important than your own
Surprise and delight those you connect with
Thank people regularly for what they do
Take risks from time to time
Ambitiously pursue solutions for hard problems
Positively encourage others’ ideas and aspirations
Be content, but not complacent, with how things currently are
Prioritize calmly, work aggressively
Gladly do the work others don’t want
Focus on details while considering a work holistically

How we allow any request to be safely repeated at any time for Wunderlist

2015-02-18T08:02:00-08:00

Stripe recently introduced an Idempotent Requests feature for their api calls to protect against duplicate charges caused by network failures. While building Wunderlist 3 we needed a similar mechanism so our remote clients could safely repeat any request. Mobile networks, corporate firewalls, and general internet connectivity issues cause problems way too often. One cannot be sure an operation has successfully arrived without receiving a full, parseable response.

We took a different route to protecting our data and I’d like to detail it for future reference. It could be useful for others to compare and contrast Stripe’s and Wunderlist’s methods for idempotent requests.

With Stripe, one includes an Idempotency-Key header with every request. If another request comes in within the next 24 hours with the same header value then Stripe will return the original response. One could build something like this using redis or memcached to hold all raw responses keyed by the idempotency header’s value with a TTL. 24 hours makes sense for Stripe since their clients are mostly background workers on servers. Even 1 hour would probably be enough, since most server workers are always running and would retry very quickly.
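A minimal sketch of that idea, assuming an in-memory store with a TTL standing in for redis or memcached. The class names (ExpiringStore, IdempotentEndpoint) and the response shape are made up for illustration; this is not how Stripe implements it.

```ruby
# Hypothetical in-memory stand-in for redis or memcached: values expire after a TTL.
class ExpiringStore
  def initialize(ttl_seconds, clock: -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Return the cached value for key, or run the block and cache its result.
  def fetch(key)
    entry = @entries[key]
    return entry[:value] if entry && entry[:expires_at] > @clock.call

    value = yield
    @entries[key] = { value: value, expires_at: @clock.call + @ttl }
    value
  end
end

# Replays the stored response when the same idempotency key is seen again.
class IdempotentEndpoint
  attr_reader :charges

  def initialize(store)
    @store = store
    @charges = 0
  end

  def call(idempotency_key, amount)
    @store.fetch(idempotency_key) do
      @charges += 1 # the side effect we must not repeat
      { status: 201, body: { charge_number: @charges, amount: amount } }
    end
  end
end

endpoint = IdempotentEndpoint.new(ExpiringStore.new(24 * 60 * 60))
first = endpoint.call("key-abc", 500)
replay = endpoint.call("key-abc", 500) # same key: the original response comes back
```

The weakness for a client that stays offline for days is visible here: once the TTL has passed, the key is forgotten and a replay would charge again.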

For Wunderlist, we have clients in remote places that sometimes are offline for days. A user might use their iPhone constantly for weeks, but only open their iPad once every week. Or a person might be on a web client at work, but a phone at home. Instead of using an in-memory cache of responses, we decided to let our databases do most of the work for us. I always favor the dumbest solution that works and so that is what we did (:

All GET requests (which are always reads) to the Wunderlist apis are safe already, so that part is easy.

For write operations every Wunderlist client maintains an operation queue that is serializable to disk. Every write request is first added to the operation queue. Then, on the other side, an operation worker marks items as in-progress, finished, etc. as it processes the requests. Right before a client quits, it writes the current state to disk. This state is restored into memory on app boot. A request could be cut short by network issues, memory pressure, or just the user quitting the app. The request could have succeeded or it could have never made it to the server; it’s impossible for the client to know.

POSTs (which are always creates) are the hardest since the client has zero knowledge about the server’s state. PUTs (which are updates) are easier, since the client already knows the server’s identifier for the object being updated. And we decided DELETEs always win and always succeed.

PUTs are easy because our sync algorithm already handles them. Every object in the Wunderlist system has an identifier and a revision. Every update to an object safely increments its revision counter by at least 1. Anytime a client updates an object it must send what it knows to be the current revision and identifier along with the properties to update. If the client’s assumed current revision is incorrect, the server immediately rejects the request and asks the client to fetch and merge the newest information. I am not ashamed to say this is identical to how CouchDB works, which is where we got the idea. If a client were to replay a successful update, it would fail because the revision would have changed and the client would be out of date. The client would fetch the new information, merge that in, notice nothing is out of date, and mark the operation in the queue as completed.
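Here is a rough sketch of the revision check, assuming a tiny in-memory store. TaskStore and StaleRevision are hypothetical names, and this is not the Wunderlist server code, just the shape of the rule.

```ruby
# Hypothetical in-memory model of the revision check.
class StaleRevision < StandardError; end

class TaskStore
  def initialize
    @tasks = {}
  end

  def create(id, attributes)
    @tasks[id] = attributes.merge(id: id, revision: 1)
  end

  def fetch(id)
    @tasks.fetch(id).dup
  end

  # The update only applies when the client knows the current revision.
  def update(id, known_revision, changes)
    task = @tasks.fetch(id)
    raise StaleRevision unless task[:revision] == known_revision

    @tasks[id] = task.merge(changes).merge(revision: task[:revision] + 1)
  end
end

store = TaskStore.new
store.create(1, title: "Buy milk")
store.update(1, 1, title: "Buy oat milk") # succeeds, revision becomes 2

replay_rejected = false
begin
  store.update(1, 1, title: "Buy oat milk") # replay of the same request
rescue StaleRevision
  replay_rejected = true # the client should now fetch and merge
end
current = store.fetch(1)
```

After the rejected replay, the client fetches the object, sees the title already matches, and can drop the operation from its queue.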

For all POST requests, the client generates a header we named X-Client-Request-ID which is basically the same as Stripe’s Idempotency-Key header. This request id is generated as part of the request operation that is added to the in-memory queue, so it is serialized to and from disk as needed. This means a client can attempt to replay a POST request even if the app was offline for a long time.
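To make the queue a bit more concrete, here is a sketch of an operation queue that carries a request id with each operation and survives a trip to disk. The OperationQueue class, its field names, and the JSON file format are all assumptions for illustration; the real clients are native apps and surely look different.

```ruby
require "json"
require "securerandom"
require "tmpdir"

# Hypothetical client-side queue; field names are illustrative only.
class OperationQueue
  attr_reader :operations

  def initialize(operations = [])
    @operations = operations
  end

  # Every write is recorded, with its request id, before it is sent.
  def enqueue(http_method, path, body)
    operation = {
      "request_id" => SecureRandom.uuid,
      "method" => http_method,
      "path" => path,
      "body" => body,
      "state" => "pending"
    }
    @operations << operation
    operation
  end

  def mark(request_id, state)
    operation = @operations.find { |op| op["request_id"] == request_id }
    operation["state"] = state if operation
  end

  # Anything not known to be finished must be sent (or reconciled) again.
  def unfinished
    @operations.reject { |op| op["state"] == "finished" }
  end

  def save(file)
    File.write(file, JSON.generate(@operations))
  end

  def self.load(file)
    new(JSON.parse(File.read(file)))
  end
end

file = File.join(Dir.tmpdir, "operation_queue_#{Process.pid}.json")
queue = OperationQueue.new
create = queue.enqueue("POST", "/tasks", { "title" => "Buy milk" })
update = queue.enqueue("PUT", "/tasks/1", { "title" => "Buy oat milk", "revision" => 3 })
queue.mark(create["request_id"], "in-progress") # the app quits mid-request
queue.mark(update["request_id"], "finished")
queue.save(file)

restored = OperationQueue.load(file) # next app boot
```

Because the request id lives inside the serialized operation, the restored queue can replay the POST with the very same id days later.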

On the server side, we let the database do the hard work for us. Every table for all synced objects has the aforementioned id and revision columns along with a created_by_request_id column. The request id column has a unique constraint, so there can never be two rows created by the same request. To prevent clients from stomping on each other’s request ids, other metadata like the current user’s id and the current device’s identifier are prepended to the already random request id. Since the operation queue is not synced between devices, that is enough to keep this super unique.
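A sketch of the server side, with an in-memory class playing the role of a table with a unique constraint. TasksTable, scoped_request_id, and create_task are hypothetical names, the ":" separator is arbitrary, and returning the existing row on a duplicate is my assumption for the example rather than a description of the real response.

```ruby
# Hypothetical stand-in for a table with a unique created_by_request_id column.
class DuplicateRequest < StandardError; end

class TasksTable
  def initialize
    @rows = []
    @next_id = 1
  end

  def insert(attributes, created_by_request_id)
    if @rows.any? { |row| row[:created_by_request_id] == created_by_request_id }
      raise DuplicateRequest # what a unique constraint violation would signal
    end

    row = attributes.merge(id: @next_id, revision: 1,
                           created_by_request_id: created_by_request_id)
    @next_id += 1
    @rows << row
    row
  end

  def find_by_request_id(created_by_request_id)
    @rows.find { |row| row[:created_by_request_id] == created_by_request_id }
  end

  def count
    @rows.size
  end
end

# Scope the client's random id by user and device.
def scoped_request_id(user_id, device_id, client_request_id)
  [user_id, device_id, client_request_id].join(":")
end

# Create the row, or hand back the one an earlier attempt already created.
def create_task(table, user_id, device_id, client_request_id, attributes)
  request_id = scoped_request_id(user_id, device_id, client_request_id)
  table.insert(attributes, request_id)
rescue DuplicateRequest
  table.find_by_request_id(request_id)
end

table = TasksTable.new
first = create_task(table, 7, "iphone", "c0ffee", { title: "Buy milk" })
again = create_task(table, 7, "iphone", "c0ffee", { title: "Buy milk" }) # replay
other = create_task(table, 7, "ipad", "c0ffee", { title: "Buy milk" })   # other device
```

The replay from the same device resolves to the row that already exists, while the same random id from a different device is treated as a new request.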

Our apis return the created_by_request_id field in the object’s json so that clients can reconcile any operations currently stuck in their operation queue without having to retry the request.

All Wunderlist clients only perform GETs during a sync and only perform any writes when flushing the operation queue. If a client is currently in read mode then it will scan through the operation queue for each newly downloaded object’s request id, attempting to reconcile an in-memory request if one is found. In write mode, the client will simply send every request from the operation queue.
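The read-mode scan could look something like this. The hash shapes and the reconcile function are hypothetical, and the sketch assumes the created_by_request_id the server returns can be compared directly with the id stored in the queue.

```ruby
# Hypothetical shapes: operations and downloaded objects are plain hashes.
def reconcile(operations, downloaded_objects)
  seen = downloaded_objects.map { |object| object["created_by_request_id"] }.compact
  operations.each do |operation|
    # The server already has an object created by this request: no retry needed.
    operation["state"] = "finished" if seen.include?(operation["request_id"])
  end
end

operations = [
  { "request_id" => "req-1", "method" => "POST", "state" => "in-progress" },
  { "request_id" => "req-2", "method" => "POST", "state" => "pending" }
]
downloaded = [
  { "id" => 10, "title" => "Buy milk", "created_by_request_id" => "req-1" },
  { "id" => 11, "title" => "From another device", "created_by_request_id" => "other-9" }
]

reconcile(operations, downloaded)
still_to_send = operations.reject { |operation| operation["state"] == "finished" }
```

Only req-2 is left to send in write mode; req-1 was stuck in-progress but turned out to have reached the server.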

Since deletes always win they can safely be replayed over and over. The object won’t get any more deleted.

Although there are multiple concepts involved, we have found this system for idempotency to be very easy to understand and very reliable. What really pays is to consider the system in its entirety (from client to server) to come up with the best possible solution. Stripe’s is very elegant and could work for a certain type of app. Our solution is tailored for our long term, long range duplicate request issues. I hope this is helpful in some way if you are thinking about these things for your app.

Make workflows for complicated tasks

2015-01-26T20:52:02-08:00

I’ve been extracting a lot of controller code into simple POROs recently, but it’s become more and more difficult and repetitive to get things to work consistently. I end up doing a lot of if statements in the #call method to manage failure states. An example might be:

# in a controller
def create
  @amount = params[:amount].to_i
  unless @amount > 100
    render :error and return
  end

  begin
    @charge = Stripe::Charge.create({
        amount: @amount,
        card: params[:card_token]
      })
  rescue Stripe::CardError => e
    render :error and return
  end

  @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
  @payment = Payment.new({
    stripe_charge_response: @charge_response,
    stripe_charge_id: @charge.id,
    amount: @charge.amount,
    currency: @charge.currency
  })

  begin
    @payment.save!
  rescue ActiveRecord::RecordInvalid => e
    render :error and return
  end

  ReceiptMailer.payment_receipt(@payment).deliver_later

  redirect_to receipt_path(@payment)
end

After some work, I decided I should extract each step of the operation into its own method:

# in a controller
def create
  if grab_amount &&
     charge &&
     save_payment &&
     send_email
    redirect_to receipt_path(@payment)
  else
    render :error
  end
end

def grab_amount
  @amount = params[:amount].to_i
  @amount > 100
end

def charge
  @charge = Stripe::Charge.create({
    amount: @amount,
    card: params[:card_token]
  })
  true
rescue Stripe::CardError => e
end

def save_payment
  @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
  @payment = Payment.new({
    stripe_charge_response: @charge_response,
    stripe_charge_id: @charge.id,
    amount: @charge.amount,
    currency: @charge.currency
  })

  @payment.save!
rescue ActiveRecord::RecordInvalid => e
end

def send_email
  ReceiptMailer.payment_receipt(@payment).deliver_later
  true
end

After that, I realized if I had a method to wrap and capture failures I could clean things up even more:

# in a controller
def create
  if grab_amount &&
     charge &&
     save_payment &&
     send_email
    redirect_to receipt_path(@payment)
  else
    render :error
  end
end

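# Defines an instance method that returns false if its block raises.
def self.define_and_capture(name, &block)
  define_method(name) do
    begin
      instance_exec(&block)
    rescue StandardError
      false
    end
  end
end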

define_and_capture :grab_amount do
  @amount = params[:amount].to_i
  @amount > 100
end

define_and_capture :charge do
  @charge = Stripe::Charge.create({
    amount: @amount,
    card: params[:card_token]
  })
  true
end

define_and_capture :save_payment do
  @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
  @payment = Payment.new({
    stripe_charge_response: @charge_response,
    stripe_charge_id: @charge.id,
    amount: @charge.amount,
    currency: @charge.currency
  })

  @payment.save!
end

define_and_capture :send_email do
  ReceiptMailer.payment_receipt(@payment).deliver_later
  true
end

And this is way too much going on in the controller, IMHO. So making a service object for this is pretty simple:

# in a controller
def create
  @service = ChargeACard.new(params[:amount], params[:card_token])
  if @service.call
    redirect_to receipt_path(@service.payment)
  else
    render :error
  end
end

# in its own file
class ChargeACard
  attr_reader :payment

  def initialize(amount, card_token)
    @amount = amount
    @card_token = card_token
  end

  def call
    if grab_amount &&
       charge &&
       save_payment &&
       send_email
      true
    else
      false
    end
  end

  # Defines an instance method that returns false if its block raises.
  def self.define_and_capture(name, &block)
    define_method(name) do
      begin
        instance_exec(&block)
      rescue StandardError
        false
      end
    end
  end

  define_and_capture :grab_amount do
    @amount = @amount.to_i
    @amount > 100
  end

  define_and_capture :charge do
    @charge = Stripe::Charge.create({
      amount: @amount,
      card: @card_token
    })
    true
  end

  define_and_capture :save_payment do
    @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
    @payment = Payment.new({
      stripe_charge_response: @charge_response,
      stripe_charge_id: @charge.id,
      amount: @charge.amount,
      currency: @charge.currency
    })

    @payment.save!
  end

  define_and_capture :send_email do
    ReceiptMailer.payment_receipt(@payment).deliver_later
    true
  end
end

I iterated on this more and then decided I should just package up the repeatable bits into a module which I am now publishing as a gem: Workout.

Workout can help declare the steps needed to work through something. If any step fails then execution halts. A workflow instance knows if it’s completed, valid, or successful. This means a lot of controller actions can return to the simple and amazing if success then render success, else render error.
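To show the general shape of the idea, here is a small self-contained sketch of a step-based workflow. The Steps module and its step method are made up for this example and are not Workout’s actual API, and the charge step only appends to a log instead of talking to Stripe.

```ruby
# A hypothetical module illustrating the idea; not Workout's actual API.
module Steps
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def steps
      @steps ||= []
    end

    # Declare a named step; the block runs in the instance's context.
    def step(name, &block)
      steps << name
      define_method(name, &block)
    end
  end

  attr_reader :failed_step

  # Run steps in order, halting at the first one that returns falsy or raises.
  def call
    @failed_step = self.class.steps.find do |name|
      begin
        !send(name)
      rescue StandardError
        true
      end
    end
    @completed = true
    success?
  end

  def completed?
    @completed == true
  end

  def success?
    completed? && failed_step.nil?
  end
end

class ChargeACard
  include Steps

  attr_reader :log

  def initialize(amount)
    @amount = amount
    @log = []
  end

  step(:grab_amount) { @amount = @amount.to_i; @amount > 100 }
  step(:charge)      { @log << "charged #{@amount}"; true } # stand-in for Stripe
  step(:send_email)  { @log << "emailed receipt"; true }
end

good = ChargeACard.new("250")
good_result = good.call
bad = ChargeACard.new("50")
bad_result = bad.call
```

With this shape a controller action only has to ask whether the call succeeded, and failed_step tells you where things stopped.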

Most service-object libraries I see online accept their arguments in the call method, but I don’t like this approach. I’ve made the mistake of setting instance vars in methods, and those can get carried over from one call to the next. To me, a better approach is to always Thing.new(args).call each time instead.

I hope someone might also find this type of object useful.

Object Oriented System Architecture

2015-01-23T15:41:40-08:00

Building large systems to process web requests, work jobs, and do other things can be daunting without a plan of attack or a system-of-thought. How many components to build, how to separate responsibilities, and when to build small or big are questions that come up over and over again. Having a framework to answer questions is a huge help and keeps things consistent, especially when working on a team.

The best way I’ve found to describe my system-of-thought for building large systems is object oriented system architecture. This means to loosely apply object oriented software design principles to the macro-level of a system’s architecture. Basically: micro-services. The term micro-service is pretty vague nowadays, so I feel it’s important to be more specific.

Over the next few months I plan to take the time to describe different principles, scenarios, and ideas about how to build large systems. The basics are:

An object is a service
Build small, well-defined objects
Objects collaborate by passing messages
Model different problem domains for the system as discrete objects
Expose well-defined attributes and methods
Document and distinguish query and command methods
Objects own their data, which includes their data-store

A simple example would be a to-do list (I know, I know). There would need to be a lists service and a tasks service at a minimum. The lists service cannot talk to the tasks data-store; instead it must query the tasks service by passing a message to a query method. Things can be broken down even further by separating task reads, task writes, and tasks storage. It is not unthinkable to have a TasksGet, TasksSave, and TasksRepository object in a class-based programming language, therefore it’s not unthinkable to have a tasks-get, tasks-save, and tasks-repository service as part of the system architecture.

If I built a service like this then the tasks service would expose an HTTP JSON api for all queries and commands. I would also have an asynchronous queue to facilitate cleanup work, cascade deletes, etc. The tasks service might message itself to cascade a change from one set of tasks to another by enqueuing a message onto a queue for a worker to later pick up.

By thinking on the macro-level about objects, one is free to think about the micro-level more simply. Most small, well-defined services don’t need large software object hierarchies. Each query and command could be defined as a single function, possibly in a very functional language, as long as it’s exposed in a well defined way to the rest of the system.

Inheritance is even possible by delegating calls from one service to another. A completed-tasks-get service might proxy every message to the tasks-get service, then filter out all non-completed tasks before returning results. Deciding if completed tasks are another service or a method on an existing service is similar to deciding if one should add a method to a class or object or to write a simple delegate object wrapper.
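As a sketch of the analogy at the class level, here are the tasks-get and completed-tasks-get services modelled as plain Ruby objects. In a real system each would be a separate process behind an HTTP JSON api; the class names follow the ones mentioned above, while the call signature and task shape are assumptions.

```ruby
# Hypothetical services modelled as plain objects.
class TasksRepository
  def initialize(tasks)
    @tasks = tasks
  end

  def all
    @tasks
  end
end

class TasksGet
  def initialize(repository)
    @repository = repository
  end

  # Query method: returns data, changes nothing.
  def call(list_id:)
    @repository.all.select { |task| task[:list_id] == list_id }
  end
end

# "Inherits" from TasksGet by delegating, then filtering the result.
class CompletedTasksGet
  def initialize(tasks_get)
    @tasks_get = tasks_get
  end

  def call(list_id:)
    @tasks_get.call(list_id: list_id).select { |task| task[:completed] }
  end
end

repository = TasksRepository.new([
  { id: 1, list_id: 1, title: "Buy milk", completed: true },
  { id: 2, list_id: 1, title: "Walk dog", completed: false },
  { id: 3, list_id: 2, title: "Call mom", completed: true }
])
tasks_get = TasksGet.new(repository)
completed_tasks_get = CompletedTasksGet.new(tasks_get)

all_in_list = tasks_get.call(list_id: 1)
completed_in_list = completed_tasks_get.call(list_id: 1)
```

Only TasksGet ever touches the repository; CompletedTasksGet knows nothing about storage, which is the same rule as the lists service not talking to the tasks data-store.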

Code is supposed to express the intent of the program and this is more easily achieved if the service has one clearly defined purpose. A service might need sophisticated permissions rules so one might choose a language better at expressing lists of rules or logic programming. Some services might just read and return data with minimal transformation and be represented as a simple function conceptually as well as in code.

I’ve begun work on a site: oosa.info.

I’d like to use this site to document and discuss object oriented design, micro-services, and system architecture in general. I’ve set up a simple homepage that links to a forum for now. I plan to add articles, tools and libraries, and link to popular newsletters. If you want to help, feel free to fork the website.

Always use a CDN

2015-01-17T16:11:55-08:00

Doing maintenance on a 6-year-old project today that “didn’t have the budget” for a [CDN](https://en.wikipedia.org/wiki/Content_delivery_network) reminded me how important one can be. I’m writing this to remind myself to stick to my guns and always make time for the important things: one of which is using Cloudfront or Cloudflare every time.

Not every site needs to “scale” (whatever that means), but it’s a complete waste of resources to keep answering requests for the same files over and over again. If a project is hosted on Heroku, then there is no web server in front of the application to intercept requests for files. The application has to answer and handle the same requests over and over again. Even if an app is behind a reverse proxy like apache or nginx, the proxy is still answering and streaming files when it doesn’t need to.

Don’t underestimate just how easy and useful it is to save money by reducing the requests per second hitting the application boxes. While your site is keeping up fine now, when a spike happens it can deal with more traffic if it only has to serve the dynamic bits. If a page has 4 assets, then every page view costs the application 5 requests instead of 1, so serving assets yourself means handling five times the requests per second. A page with lots of logos or icons or whatever might have 100s of assets and every page view is multiplied that much. The simple math is: don’t serve assets yourself.
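To put rough numbers on it, here is the arithmetic as a tiny function. It assumes every asset is requested on every page view without a CDN and that the CDN answers every asset request from its cache, which ignores browser caching and cache misses; the function name and the traffic figures are made up.

```ruby
# Requests per second hitting the application for a given page-view rate.
def app_requests_per_second(page_views_per_second, assets_per_page, cdn: false)
  # One request for the page itself, plus one per asset when there is no CDN.
  requests_per_view = cdn ? 1 : 1 + assets_per_page
  page_views_per_second * requests_per_view
end

without_cdn = app_requests_per_second(100, 4)            # => 500
with_cdn    = app_requests_per_second(100, 4, cdn: true) # => 100
icon_heavy  = app_requests_per_second(100, 100)          # => 10100
```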

If you’re willing to repoint the primary domain’s DNS then using Cloudflare is super simple. Sign up, then change DNS settings. Sadly, Cloudflare does not provide a hostname to use, so letting them proxy the entire domain is the only way.

My favorite CDN is Amazon’s Cloudfront and it’s amazingly simple to set up. All CDNs work the same: the origin server is the application’s primary domain. Amazon provides a unique hostname to use; just append the original file’s path. When Amazon doesn’t have a file in its cache it will ask the origin domain for a fresh copy at that path. All one has to do is prepend the hostname like: https://abcdefg123.cloudfront.net/images/cats.gif. Another great thing is Amazon provides SSL for all Cloudfront subdomains, so no mixed content warnings or anything like that.

I just recently set this up for a Rails project hosted on Heroku by simply making this change to config/environments/production.rb:

config.action_controller.asset_host = 
  "https://abcdefg123.cloudfront.net"

To make things easier for myself I just set it to always use https even if the page including the asset is http. It’s possible to use URLs like //abcdefg123.cloudfront.net/images/cats.gif without the protocol which tells the browser “use the same protocol as the base document” kinda sorta. Try it out if you haven’t.

For any other type of application or framework it’s pretty simple to do something like this pseudocode:

module.exports = function asset_path(path) {
  if ($application.env === "production") {
    return $application.config.asset_host + path;
  } else {
    return path;
  }
};

(Yeah, that’s javascript referencing a global named $application. Sorry.) The basic idea is to make sure every time the path is output for an asset that it is “rendered” so in production it can use a different host.

Since Cloudflare takes over the DNS they let any request for a file not currently cached pass through. This means less configuration up front, but it can lead to confusion if it’s a person’s first time using a CDN. It’s possible to configure Cloudfront to take over the origin domain as well, but it’s not required with Amazon’s setup.

After setting up a CDN and making sure all assets have the correct hostname in the URLs, it’s important to set the correct headers so the CDN knows how long to keep the files. Heroku has a great overview of HTTP caching and you should read it. The simplest thing to do is to add a Cache-Control header for every asset requested by the CDN backend:

Cache-Control: public, max-age=31536000

This is precisely what Rails does when it serves an asset in production, so I got pretty lucky there. All good web frameworks will do the same thing. If your project is framework-less or using some new-awesome-hipster-cool-thing-framework then it should be simple to add this header when an asset is requested; if it’s not, then provide a patch or module or reconsider your choice.
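For a framework-less Ruby app, adding the header could look like the following Rack-style middleware. This is only a sketch: the AssetCacheControl name and the "/assets/" prefix are assumptions, and it only relies on the plain call(env) convention, so nothing here is tied to a specific Rack release (newer Rack versions expect lowercase header names, so adjust the key if needed).

```ruby
# Hypothetical Rack-style middleware; the "/assets/" prefix is an assumption.
class AssetCacheControl
  ONE_YEAR = 31_536_000 # seconds

  def initialize(app, prefix: "/assets/")
    @app = app
    @prefix = prefix
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Only asset responses get the long-lived, publicly cacheable header.
    if env["PATH_INFO"].to_s.start_with?(@prefix)
      headers = headers.merge("Cache-Control" => "public, max-age=#{ONE_YEAR}")
    end
    [status, headers, body]
  end
end

# A stand-in application that answers every request the same way.
app = ->(env) { [200, { "Content-Type" => "text/plain" }, ["ok"]] }
stack = AssetCacheControl.new(app)

asset = stack.call("PATH_INFO" => "/assets/cats.gif")
page  = stack.call("PATH_INFO" => "/receipts/1")
```

A year-long max-age is only safe when an asset’s URL changes whenever its content changes, for example through fingerprinted filenames.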

Finally, check the server access logs to see the newfound amazingness! Offloading traffic can get way more money out of the application boxes and might even allow one to scale down. Not enough writing online talks about scaling down instead of up.

Why make a Mash?

2014-12-19T07:45:39-08:00

Hashie is fine

Recently, Richard Schneeman wrote a very good article titled Hashie Considered Harmful - An Ode to Hash and OpenStruct. Give it a read, there is some wisdom there. However, I have a bit of a different take on this issue. I’ve also had this as a draft in Svbtle for way too long.

First, let’s get this straight: if OpenStruct is useful then Hashie::Mash is useful too. And OpenStruct is really useful. Also, don’t let anyone tell you “you don’t need a hash-like object that responds to methods” because you very well might need it. Always contrast your goals and the implementation of a library to make sure it’s as simple as it could be.

Second, don’t take advice about what to use from people who can’t explain the pain or joy around it. It’s like someone who says to use postgres instead of mysql, but has no clear reason to prefer anything. What is the real pain here? What is the real benefit? What circumstances were there?

To be clear, Richard explains that misspellings, insensitive access to hash-like object keys, and increased memory usage can cause issues, and he is correct. However, from certain perspectives, the tools shouldn’t try to help with misspellings at all: Javascript objects don’t raise on missing keys. Memory usage is relative to each application and my libs are not generally my problem, so we differ here too.

I will try to detail why I always tell people to use what they want, but they probably don’t need Hashie anyway.

Why not to use `Hashie`

There is one very good reason not to use Hashie::Mash at all that I don’t see explained very often: #zip.

$ gem install hashie
$ irb -rhashie

address = Hashie::Mash.new(street: "100 Street St", city: "city", zip: 10119)
address.zip # => [[["street", "100 Street St"]], [["city", "city"]], [["zip", 10119]]]

It’s honestly that simple. Mash inherits from Hash, which includes Enumerable, so a huge number of keys (175) collide with existing methods and have surprising behavior. However, this does not mean Hashie is bad or not useful; it’s just how it works and one needs to know that.
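One way to check a set of keys for this problem without installing anything is to ask Hash itself which names are already taken. This sketch uses a plain Hash rather than Hashie, on the assumption that any Hash subclass inherits the same methods; the address data is made up.

```ruby
# Keys that share a name with an existing Hash method cannot be reached
# through method-style access on an object that inherits from Hash.
address = { "street" => "100 Street St", "city" => "city", "zip" => 10119, "count" => 2 }

shadowed = address.keys.select { |key| Hash.method_defined?(key) }
# "zip" and "count" come from Enumerable; "street" and "city" are free.
```

The exact number of taken names depends on the Ruby version, so it is worth running this against your own data rather than trusting a fixed list.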

What to use instead

Maybe `OpenStruct` is “better”

No, OpenStruct is slow. Let’s see some data comparing it to Hashie:

$ gem install benchmark-ips hashie
$ irb -rbenchmark/ips -rhashie -rostruct

Benchmark.ips do |x|
  x.report("ostruct") { OpenStruct.new(street: "100 Street St").street }
  x.report("hashie") { Hashie::Mash.new(street: "100 Street St").street }
end

Calculating -------------------------------------
             ostruct    12.509k i/100ms
              hashie    23.823k i/100ms
-------------------------------------------------
             ostruct    135.329k (± 7.8%) i/s -    675.486k
              hashie    313.649k (± 5.9%) i/s -      1.572M

Hashie is at least twice as fast for the simple case of building a hash-like object and calling a method on it. This is what my normal usage of these tools looks like: grab some data and call methods on the resulting objects.

Oh, why would one use “OpenStruct” then?

OpenStruct compiles the method into the instance so repeated calls will be fast. Here is what that looks like:

$ gem install benchmark-ips hashie
$ irb -rbenchmark/ips -rhashie -rostruct

Benchmark.ips do |x|
  x.report("ostruct") {
    o = OpenStruct.new(street: "100 Street St")
    100.times { o.street }
  }
  x.report("hashie") {
    m = Hashie::Mash.new(street: "100 Street St")
    100.times { m.street }
  }
end

Calculating -------------------------------------
             ostruct     4.563k i/100ms
              hashie     1.363k i/100ms
-------------------------------------------------
             ostruct     46.592k (± 3.7%) i/s -    232.713k
              hashie     13.598k (± 4.1%) i/s -     68.150k

OpenStruct is over three times faster for repeated calls to keys. So for long lived objects, OpenStruct is way better than Hashie. However, there is something even better for long lived objects: Struct. If your objects are really that long lived you will probably know their schema and you can just make a class (Struct is a class factory, so use it) that conforms to that schema.
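For example, the address from earlier as a Struct. The Address name is just for illustration.

```ruby
# Struct.new returns a new class with an accessor for each member.
Address = Struct.new(:street, :city, :zip)

address = Address.new("100 Street St", "city", 10119)
address.zip  # => 10119, the member accessor wins over Enumerable#zip
address.to_h # => {:street=>"100 Street St", :city=>"city", :zip=>10119}

# A misspelled member raises instead of quietly returning nil.
misspelling_raised = false
begin
  address.zpi
rescue NoMethodError
  misspelling_raised = true
end
```

The schema is fixed up front, so you get real methods and an error on typos.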

What does this mean?

What it always means: the tools one chooses to use should be tailored to the use case.

I build a lot of apis and those apis all produce and consume JSON which in ruby is best represented as Hashes or Arrays of Hashes. However, one of these lines of code is prettier:

task_ids = tasks.map(&:id)
task_ids = tasks.map { |task| task["id"] }

There are other examples too where using methods is much preferred from a stylistic point of view. My apis change a lot at first, so dynamically providing the Hash#keys as methods allows me to move quicker. It’s possible that eventually I would define a Struct for each version of each api later, which is an easy refactor since the tests all still pass because nothing really changes.

If we shouldn’t use Hashie and OpenStruct is slow, what do we do?

I made my own Mash

Yeah, I know, NIH and all that. But, as I typed above, evaluate tools on what they are being or will be used for. For my api producing/consuming applications I need hash wrappers that are fast and use very little memory. These wrapped objects are not long lived.

My library is called Mashed. It does three things: provides indifferent access in a predictable way, provides a hash wrapper that has a very small method footprint, and represents the internal hash’s keys as methods.

Indifferent access

Symbols in ruby are kinda annoying. Until 2.2 (due out very soon I guess) they are not garbage collected, so technically you could allow anyone to DDOS your app by making every JSON object into a symbolized hash. (I’ve considered just going down this road and making sure I monitor my app correctly, but I’ve never actually done it.) Luckily this will go away when they are garbage collected and I will even change my implementation when that happens.

But for now, I created the StringyHash. It does not inherit from Hash, but instead wraps and delegates to a hash instance. The method footprint is small and it doesn’t extend any built-in ruby classes at all.

The example from the README should explain it all:

StringyHash = Mashed::StringyHash

sh = StringyHash.new(title: "Hello", starred: false, completed_at: nil)
sh.keys # => ["title", "starred", "completed_at"]
sh[:title] # => "Hello"
sh["title"] # => "Hello"

class Title
  def to_s
    "title"
  end
end

sh[Title.new] # => "Hello"

The goal is to be a very sensible delegator to the internal hash instance. I’ve had zero issues so far with this in many production systems. For 2.2 I will make a SymbolizedHash class I guess.
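To give a feel for the wrap-and-delegate approach, here is a stripped-down sketch. TinyStringyHash is a made-up class for illustration, not the Mashed source, and it only covers a handful of methods.

```ruby
# A hypothetical sketch of a string-keyed wrapper; not the Mashed source.
class TinyStringyHash
  def initialize(hash)
    @hash = {}
    hash.each { |key, value| @hash[key.to_s] = value }
  end

  # Every key is stringified on the way in, so lookups are indifferent.
  def [](key)
    @hash[key.to_s]
  end

  def []=(key, value)
    @hash[key.to_s] = value
  end

  def keys
    @hash.keys
  end

  def to_hash
    @hash.dup
  end
end

sh = TinyStringyHash.new(title: "Hello", starred: false)
sh[:title]  # => "Hello"
sh["title"] # => "Hello"
sh[:completed_at] = nil
```

Since nothing inherits from Hash, the only methods on the wrapper are the ones written out here.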

Wrapper with very small method footprint

In ruby, every object has a lot of built-in methods.

Object.new.methods.count # => 55

Every ruby object has at least 55 methods. If the goal is to provide almost any key that might be set in a Hash as a method, that is 55 keys that are impossible to get to. Luckily, ruby allows one to start from a smaller point with BasicObject.

BasicObject.new.methods.count # =>
# NoMethodError: undefined method `methods' for #<BasicObject:0x007fd454b8fdc8>

That’s right, it doesn’t even know what its list of methods is. My Mash inherits from BasicObject and provides a very small number of built-in methods.

$ gem install mashed
$ irb -rmashed

Mashed::Mash.new({}).methods.count # => 26

I’m always trying to get that number lower as well. Please, if you ever have ideas for how to do that then make a PR or Issue.

Delegate methods to key/value lookups

Now, how does Mash fare in the #zip example:

$ gem install mashed
$ irb -rmashed

address = Mashed::Mash.new(street: "100 Street St", city: "city", zip: 10119)
address.zip # => 10119

It works in an unsurprising manner. The “secret” to Mash being a good citizen is for it not to be hash-like at all.

Examples:

address["zip"] # =>
# NoMethodError: private method `[]' called for #<Mashed::Mash:0x007fb501049cd8>

address.merge(state: "VA") # =>
# NoMethodError: undefined method `merge' for #<Mashed::Mash:0x007fb501049cd8>

address.map { |k,v| puts "#{k}: #{v}" } # =>
# NoMethodError: undefined method `map' for #<Mashed::Mash:0x007fb501049cd8>

address.inspect # => "#<Mashed::Mash @hash=>{\"street\"=>\"100 Street St\", \"city\"=>\"city\", \"zip\"=>10119}>"

It just refuses to appear to be a Hash.
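The core trick can be sketched in a few lines with BasicObject and method_missing. TinyMash is a made-up class for illustration and not the Mashed implementation, which does more than this.

```ruby
# A hypothetical sketch of a method-only hash wrapper; not the Mashed source.
class TinyMash < BasicObject
  def initialize(hash)
    @hash = {}
    hash.each { |key, value| @hash[key.to_s] = value }
  end

  private

  # Any zero-argument call becomes a key lookup; missing keys return nil.
  def method_missing(name, *args)
    return super unless args.empty?

    @hash[name.to_s]
  end
end

address = TinyMash.new(street: "100 Street St", city: "city", zip: 10119)
zip = address.zip         # a plain key lookup, since BasicObject has no zip
missing = address.country # no such key, so nil
```

Because BasicObject defines almost nothing, nearly every name falls through to method_missing and is treated as a key.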

There are still problems, one of which is an issue right now: single method calls with zero arguments return nil if the key is missing. This is inevitable based on the current design constraints: Mash acts like a Javascript Object where missing keys are undefined.

I find this unsurprising since accessing a missing key on a Hash returns nil. However, I am considering making a monad or something to possibly make it easier to understand.

Why is `Hashie` not like `Mashed`?

Because it’s a different tool. Hashie is actually a great library and everyone should not only try to use it at least once, but read through its code. You can learn a ton by seeing how others have solved similar problems.

OpenStruct is awesome too. If you’re making a ruby script and you want to have no dependencies outside the standard library then use it; this happens to me when I’m working on build or deployment scripts.

Use what works for your current situation

Write tests, evaluate libraries based on their implementation and api, and don’t listen to anyone including me (: