tag:myobie.svbtle.com,2014:/feedNathan Herald2015-03-18T03:38:15-07:00Nathan Heraldhttps://myobie.svbtle.comSvbtle.comtag:myobie.svbtle.com,2014:Post/my-core-values2015-03-18T03:38:15-07:002015-03-18T03:38:15-07:00My core values<ul>
<li>Others’ needs and wants are more important than your own</li>
<li>Surprise and delight those you connect with</li>
<li>Thank people regularly for what they do</li>
<li>Take risks from time to time</li>
<li>Ambitiously pursue solutions for hard problems</li>
<li>Positively encourage others’ ideas and aspirations</li>
<li>Be content, but not complacent, with how things are currently</li>
<li>Prioritize calmly, work aggressively</li>
<li>Gladly do the work others don’t want</li>
<li>Focus on details while considering a work holistically</li>
</ul>
tag:myobie.svbtle.com,2014:Post/how-we-allow-any-request-to-be-safely-repeated-at-anytime-for-wunderlist2015-02-18T08:02:00-08:002015-02-18T08:02:00-08:00How we allow any request to be safely repeated at anytime for Wunderlist<p>Stripe recently introduced an <a href="https://stripe.com/docs/api#idempotent_requests">Idempotent requests feature</a> for their API calls to protect against duplicate charges caused by network failures. While building Wunderlist 3 we needed a similar mechanism so our remote clients could safely repeat any request. Mobile networks, corporate firewalls, and general internet connectivity issues cause problems way too often. One cannot be sure an operation has successfully arrived without receiving a full, parseable response.</p>
<p>We took a different route to protecting our data and I’d like to detail it for future reference. It could be useful for others to compare and contrast Stripe’s and Wunderlist’s methods for idempotent requests.</p>
<p>With Stripe, one includes an <code class="prettyprint">Idempotency-Key</code> header with every request. If a request comes in within the next 24 hours with the same header value then Stripe will return the original response. One could build something like this utilizing Redis or Memcached to hold all raw responses keyed by the idempotency header’s value with a TTL. 24 hours makes sense for Stripe since their clients are mostly background workers on servers. Even 1 hour would probably be enough, since most server workers are always working and would retry very quickly.</p>
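<p>A minimal sketch of that kind of TTL cache, assuming a plain in-memory Hash as a stand-in for Redis or Memcached. The class name <code class="prettyprint">IdempotentResponses</code> and the injectable clock are made up for illustration; this is not how Stripe implements it.</p>
<pre><code class="prettyprint lang-ruby"># Hypothetical stand-in for Redis or Memcached: stores the first response
# for each idempotency key and replays it until the TTL expires.
class IdempotentResponses
  DAY = 24 * 60 * 60 # seconds

  def initialize(ttl: DAY, clock: -> { Time.now.to_f })
    @ttl = ttl
    @clock = clock
    @entries = {} # key => [expires_at, response]
  end

  # Runs the block only when no unexpired response is stored for the key.
  def fetch(key)
    now = @clock.call
    expires_at, response = @entries[key]
    return response if expires_at && expires_at > now

    response = yield
    @entries[key] = [now + @ttl, response]
    response
  end
end

charges = 0
responses = IdempotentResponses.new

first = responses.fetch("key-123") do
  charges += 1
  { status: 201 }
end

# A retry with the same key replays the stored response without charging again.
retried = responses.fetch("key-123") do
  charges += 1
  { status: 201 }
end
</code></pre>
<p>The block stands in for the real work (creating the charge), so a replayed request never reaches it while the stored response is still fresh.</p>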
<p>For Wunderlist, we have clients in remote places that sometimes are offline for days. A user might use their iPhone constantly for weeks, but only open their iPad once every week. Or a person might be on a web client at work, but a phone at home. Instead of using an in-memory cache of responses, we decided to let our databases do most of the work for us. I always favor the dumbest solution that works and so that is what we did (:</p>
<p>All GET requests (which are always reads) to the Wunderlist APIs are safe already, so that part is easy.</p>
<p>For write operations, every Wunderlist client maintains an operation queue that is serializable to disk. Every write request is first added to the operation queue. Then, on the other side, an operation worker marks items as in-progress, finished, etc. as it processes the requests. Right before a client quits, it writes the current state to disk. This state is restored into memory on app boot. A request could be cut short by network issues, memory pressure, or just the user quitting the app. The request could have succeeded or it could have never made it to the server; it’s impossible for the client to know.</p>
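<p>Roughly, such a queue could look like the sketch below. Everything in it is an assumption for illustration (the <code class="prettyprint">OperationQueue</code> name, the JSON file format, the state names); the real clients are native apps and are not written in ruby.</p>
<pre><code class="prettyprint lang-ruby">require "json"
require "tmpdir"

# Hypothetical client-side queue; names and file format are illustrative.
class OperationQueue
  attr_reader :operations

  def initialize(operations = [])
    @operations = operations
  end

  # Every write request is queued before anything is sent.
  def enqueue(http_method, path, body)
    operation = {
      "method" => http_method,
      "path" => path,
      "body" => body,
      "state" => "pending"
    }
    @operations.push(operation)
    operation
  end

  # The worker records progress as it processes the queue.
  def mark(operation, state)
    operation["state"] = state
  end

  # Called right before the app quits.
  def save(file)
    File.write(file, JSON.generate(@operations))
  end

  # Called on boot. The outcome of anything that was in flight is unknown,
  # so it goes back to pending and will be retried.
  def self.restore(file)
    operations = JSON.parse(File.read(file))
    operations.each do |operation|
      operation["state"] = "pending" if operation["state"] == "in-progress"
    end
    new(operations)
  end
end

file = File.join(Dir.mktmpdir, "operations.json")
queue = OperationQueue.new
operation = queue.enqueue("POST", "/tasks", { "title" => "Buy milk" })
queue.mark(operation, "in-progress")
queue.save(file) # the app quits mid-request

restored = OperationQueue.restore(file)
</code></pre>
<p>After the restore the interrupted request is pending again, which is only safe because every request can be repeated.</p>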
<p>POSTs (which are always creates) are the hardest since the client has zero knowledge about the server’s state. PUTs (which are updates) are easier, since the client already knows the server’s identifier for the object being updated. And we decided DELETEs always win and always succeed.</p>
<p>PUTs are easy because our sync algorithm already handles them. Every object in the Wunderlist system has an identifier and a revision. Every update to an object safely increments its revision counter by at least 1. Anytime a client updates an object it must send what it knows to be the current revision and identifier along with the properties to update. If the client’s assumed current revision is incorrect, the server immediately rejects the request and asks the client to fetch and merge the newest information. I am not ashamed to say this is identical to how CouchDB works, which is where we got the idea. If a client were to replay a successful update, it would fail because the revision would have changed and the client would be out of date. The client would fetch the new information, merge that in, notice nothing is out of date, and reconcile the operation in the queue to be completed.</p>
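<p>The revision check can be sketched like this. The in-memory <code class="prettyprint">TaskStore</code> and the <code class="prettyprint">RevisionConflict</code> error are hypothetical; a real server would do the comparison and the increment atomically in the database.</p>
<pre><code class="prettyprint lang-ruby">RevisionConflict = Class.new(StandardError)

# Hypothetical in-memory server side, for illustration only.
class TaskStore
  def initialize
    @tasks = {} # id => { revision:, attributes: }
  end

  def create(id, attributes)
    @tasks[id] = { revision: 1, attributes: attributes }
  end

  def fetch(id)
    @tasks.fetch(id)
  end

  # Applies the update only if the client knows the current revision.
  def update(id, revision, changes)
    task = @tasks.fetch(id)
    unless task[:revision] == revision
      raise RevisionConflict, "out of date, fetch and merge first"
    end

    task[:attributes] = task[:attributes].merge(changes)
    task[:revision] += 1
    task
  end
end

store = TaskStore.new
store.create(42, { title: "Buy milk" })
store.update(42, 1, { title: "Buy oat milk" }) # accepted, revision is now 2

# Replaying the very same request is rejected instead of applied twice.
replayed = begin
  store.update(42, 1, { title: "Buy oat milk" })
  :applied
rescue RevisionConflict
  :rejected
end
</code></pre>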
<p>For all POST requests, the client generates a header we named <code class="prettyprint">X-Client-Request-ID</code> which is basically the same as Stripe’s <code class="prettyprint">Idempotency-Key</code> header. This request id is generated as part of the request operation that is added to the in-memory queue, so it is serialized to and from disk as needed. This means a client can attempt to replay a POST request even if the app was offline for a long time.</p>
<p>On the server side, we let the database do the hard work for us. Every table for all synced objects has the aforementioned <code class="prettyprint">id</code> and <code class="prettyprint">revision</code> columns along with a <code class="prettyprint">created_by_request_id</code> column. The request id column has a unique constraint, so there can never be two rows with the same request id. To prevent clients from stomping on top of each other’s request ids, other metadata like the current user’s id and the current device’s identifier are prepended to the already random request id. Since the operation queue is not synced between devices that is enough to keep this super unique.</p>
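<p>As a rough illustration, here is the same idea with a Hash playing the role of the unique index. The names, the <code class="prettyprint">:</code> separator, and the choice to answer a duplicate with the row that already exists are all assumptions; the post does not say how the server responds when the constraint is hit.</p>
<pre><code class="prettyprint lang-ruby">require "securerandom"

# Prefix the random id with user and device so two clients cannot collide.
# The ":" separator is an arbitrary choice for this sketch.
def scoped_request_id(user_id, device_id, request_id)
  [user_id, device_id, request_id].join(":")
end

# Hypothetical table with a unique index on created_by_request_id.
class TasksTable
  def initialize
    @rows = []
    @by_request_id = {} # plays the role of the unique index
  end

  # A replayed create finds the existing row instead of inserting a second one.
  def insert(attributes, created_by_request_id)
    existing = @by_request_id[created_by_request_id]
    return existing if existing

    row = attributes.merge(
      id: @rows.length + 1,
      revision: 1,
      created_by_request_id: created_by_request_id
    )
    @rows.push(row)
    @by_request_id[created_by_request_id] = row
  end

  def count
    @rows.length
  end
end

table = TasksTable.new
request_id = scoped_request_id(7, "ipad-1", SecureRandom.uuid)

first = table.insert({ title: "Buy milk" }, request_id)
replayed = table.insert({ title: "Buy milk" }, request_id) # same request again
</code></pre>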
<p>Our APIs return the <code class="prettyprint">created_by_request_id</code> field in the object’s JSON so that clients can reconcile any operations currently stuck in their operation queue without having to retry the request.</p>
<p>All Wunderlist clients only perform GETs during a sync and only perform any writes when flushing the operation queue. If a client is currently in read mode then it will scan through the operation queue for each newly downloaded object’s request id, attempting to reconcile an in-memory request if one is found. In write mode, the client will simply send every request from the operation queue.</p>
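<p>The read-mode pass might look something like this sketch. The hash shapes and state names are invented for illustration.</p>
<pre><code class="prettyprint lang-ruby"># Hypothetical read-mode pass: match downloaded objects against queued creates.
def reconcile(operations, downloaded_objects)
  created = downloaded_objects.map { |object| object["created_by_request_id"] }

  operations.each do |operation|
    next unless operation["state"] == "pending"
    operation["state"] = "finished" if created.include?(operation["request_id"])
  end
end

operations = [
  { "request_id" => "7:iphone:abc", "state" => "pending" },
  { "request_id" => "7:iphone:def", "state" => "pending" }
]
downloaded = [
  { "id" => 1, "revision" => 1, "created_by_request_id" => "7:iphone:abc" },
  { "id" => 2, "revision" => 3, "created_by_request_id" => "9:web:zzz" }
]

reconcile(operations, downloaded)
# The first create already reached the server, so it is marked finished.
# The second one is still pending and gets sent in write mode.
</code></pre>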
<p>Since deletes always win they can safely be replayed over and over. The object won’t get any more deleted.</p>
<p>Although there are multiple concepts involved, we have found this system for idempotency to be very easy to understand and very reliable. What really pays is to consider the system in its entirety (from client to server) to come up with the best possible solution. Stripe’s is very elegant and could work for a certain type of app. Our solution is tailored for our long-term, long-range duplicate request issues. I hope this is helpful in some way if you are thinking about these things for your app.</p>
tag:myobie.svbtle.com,2014:Post/make-workflows-for-complicated-tasks2015-01-26T20:52:02-08:002015-01-26T20:52:02-08:00Make workflows for complicated tasks<p>I’ve been extracting a lot of controller code into simple POROs recently, but it’s become more and more difficult and repetitive to get things to work consistently. I end up doing a lot of <code class="prettyprint">if</code> statements in the <code class="prettyprint">#call</code> method to manage failure states. An example might be:</p>
<pre><code class="prettyprint"># in a controller
def create
  @amount = params[:amount].to_i
  unless @amount > 100
    render :error and return
  end

  begin
    @charge = Stripe::Charge.create({
      amount: @amount,
      card: params[:card_token]
    })
  rescue Stripe::CardError => e
    render :error and return
  end

  @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
  @payment = Payment.new({
    stripe_charge_response: @charge_response,
    stripe_charge_id: @charge.id,
    amount: @charge.amount,
    currency: @charge.currency
  })

  begin
    @payment.save!
  rescue ActiveRecord::RecordInvalid => e
    render :error and return
  end

  ReceiptMailer.payment_receipt(@payment).deliver_later
  redirect_to receipt_path(@payment)
end
</code></pre>
<p>After some work, I decided I should extract each step of the operation into its own method:</p>
<pre><code class="prettyprint"># in a controller
def create
  if grab_amount &&
     charge &&
     save_payment &&
     send_email
    redirect_to receipt_path(@payment)
  else
    render :error
  end
end

def grab_amount
  @amount = params[:amount].to_i
  @amount > 100
end

def charge
  @charge = Stripe::Charge.create({
    amount: @amount,
    card: params[:card_token]
  })
  true
rescue Stripe::CardError => e
  false
end

def save_payment
  # use the @charge ivar here: calling charge would hit Stripe a second time
  @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
  @payment = Payment.new({
    stripe_charge_response: @charge_response,
    stripe_charge_id: @charge.id,
    amount: @charge.amount,
    currency: @charge.currency
  })
  @payment.save!
rescue ActiveRecord::RecordInvalid => e
  false
end

def send_email
  ReceiptMailer.payment_receipt(@payment).deliver_later
  true
end
</code></pre>
<p>After that, I realized if I had a method to wrap and capture failures I could clean things up even more:</p>
<pre><code class="prettyprint"># in a controller
def create
  if grab_amount &&
     charge &&
     save_payment &&
     send_email
    redirect_to receipt_path(@payment)
  else
    render :error
  end
end

# Defines an instance method that runs the block and returns false
# instead of raising when the step fails.
def self.define_and_capture(name, &block)
  define_method(name) do
    begin
      instance_exec(&block)
    rescue StandardError
      false
    end
  end
end

define_and_capture :grab_amount do
  @amount = params[:amount].to_i
  @amount > 100
end

define_and_capture :charge do
  @charge = Stripe::Charge.create({
    amount: @amount,
    card: params[:card_token]
  })
  true
end

define_and_capture :save_payment do
  @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
  @payment = Payment.new({
    stripe_charge_response: @charge_response,
    stripe_charge_id: @charge.id,
    amount: @charge.amount,
    currency: @charge.currency
  })
  @payment.save!
end

define_and_capture :send_email do
  ReceiptMailer.payment_receipt(@payment).deliver_later
  true
end
</code></pre>
<p>This is still way too much going on in the controller, IMHO. Making a service object for it is pretty simple:</p>
<pre><code class="prettyprint"># in a controller
def create
  @service = ChargeACard.new(params[:amount], params[:card_token])
  if @service.call
    redirect_to receipt_path(@service.payment)
  else
    render :error
  end
end

# in its own file
class ChargeACard
  attr_reader :payment

  def initialize(amount, card_token)
    @amount = amount
    @card_token = card_token
  end

  def call
    if grab_amount &&
       charge &&
       save_payment &&
       send_email
      true
    else
      false
    end
  end

  # Defines an instance method that runs the block and returns false
  # instead of raising when the step fails.
  def self.define_and_capture(name, &block)
    define_method(name) do
      begin
        instance_exec(&block)
      rescue StandardError
        false
      end
    end
  end

  define_and_capture :grab_amount do
    # there are no params outside the controller, use what was passed in
    @amount = @amount.to_i
    @amount > 100
  end

  define_and_capture :charge do
    @charge = Stripe::Charge.create({
      amount: @amount,
      card: @card_token
    })
    true
  end

  define_and_capture :save_payment do
    @charge_response = StripeChargeResponse.new(body: @charge.to_hash)
    @payment = Payment.new({
      stripe_charge_response: @charge_response,
      stripe_charge_id: @charge.id,
      amount: @charge.amount,
      currency: @charge.currency
    })
    @payment.save!
  end

  define_and_capture :send_email do
    ReceiptMailer.payment_receipt(@payment).deliver_later
    true
  end
end
</code></pre>
<p>I iterated on this more and then decided I should just package up the repeatable bits into a module which I am now publishing as a gem: <a href="https://github.com/myobie/workout">Workout</a>.</p>
<p>Workout can help declare the steps needed to work through something. If any step fails then execution halts. A workflow instance knows if it’s completed, valid, or successful. This means a lot of controller actions can return to the simple and amazing “if success then render success, else render error”.</p>
<p>Most service object libraries I see online accept their arguments in the call method, but I don’t like this approach. I’ve made the mistake of setting instance vars in methods, and those can get carried over from one call to the next when an instance is reused. To me, a better approach is to always <code class="prettyprint">Thing.new(args).call</code> each time instead.</p>
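<p>A small, made-up example of the carry-over problem (neither class comes from Workout): the first service takes its arguments in <code class="prettyprint">call</code> and memoizes in an instance var, the second takes them in the constructor and is built fresh every time.</p>
<pre><code class="prettyprint lang-ruby"># Hypothetical service that takes its arguments in #call and memoizes.
class ReusedTotal
  def call(prices)
    @total ||= prices.sum
  end
end

# Same work, but the arguments go to the constructor.
class FreshTotal
  def initialize(prices)
    @prices = prices
  end

  def call
    @total ||= @prices.sum
  end
end

reused = ReusedTotal.new
reused.call([1, 2])           # => 3
stale = reused.call([10, 20]) # => 3, the first result was carried over

fresh = FreshTotal.new([10, 20]).call # => 30
</code></pre>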
<p>I hope someone might also find this type of object useful.</p>
tag:myobie.svbtle.com,2014:Post/object-oriented-server-architecture-12015-01-23T15:41:40-08:002015-01-23T15:41:40-08:00Object Oriented System Architecture<p>Building large systems to process web requests, work jobs, and do other things can be daunting without a plan of attack or a system-of-thought. How many components to build, how to separate responsibilities, and when to build small or big are questions that come up over and over again. Having a framework to answer questions is a huge help and keeps things consistent, especially when working on a team.</p>
<p>The best way I’ve found to describe my system-of-thought for building large systems is object oriented system architecture. This means to loosely apply object oriented software design principles to the macro-level of a system’s architecture. Basically: micro-services. The term micro-service is pretty vague nowadays, so I feel it’s important to be more specific.</p>
<p>Over the next few months I plan to take the time to describe different principles, scenarios, and ideas about how to build large systems. The basics are:</p>
<ul>
<li>An object is a service</li>
<li>Build small, well-defined objects</li>
<li>Objects collaborate by passing messages</li>
<li>Model different problem domains for the system as discrete objects</li>
<li>Expose well-defined attributes and methods</li>
<li>Document and distinguish query and command methods</li>
<li>Objects own their data, which includes their data-store</li>
</ul>
<p>A simple example would be a to-do list (I know, I know). There would need to be a lists service and a tasks service at a minimum. The lists service cannot talk to the tasks data-store; instead, it must query the tasks service by passing a message to a query method. Things can be broken down even further by separating task reads, task writes, and tasks storage. It is not unthinkable to have a TasksGet, TasksSave, and TasksRepository object in a class-based programming language, therefore it’s not unthinkable to have a tasks-get, tasks-save, and tasks-repository service as part of the system architecture.</p>
<p>If I built a service like this then the tasks service would expose an HTTP JSON API for all queries and commands. I would also have an asynchronous queue to facilitate cleanup work, cascade deletes, etc. The tasks service might message itself to cascade a change from one set of tasks to another by enqueuing a message onto a queue for a worker to later pick up.</p>
<p>By thinking on the macro-level about objects, one is free to think about the micro-level more simply. Most small, well-defined services don’t need large software object hierarchies. Each query and command could be defined as a single function, possibly in a very functional language, as long as it’s exposed in a well defined way to the rest of the system. </p>
<p>Inheritance is even possible by delegating calls from one service to another. A completed-tasks-get service might proxy every message to the tasks-get service, then filter out all non-completed tasks before returning results. Deciding if completed tasks are another service or a method on an existing service is similar to deciding whether to add a method to an existing class or to write a simple delegate object wrapper.</p>
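<p>To make the analogy concrete, here is the delegation written as plain ruby objects. The two classes are hypothetical in-process stand-ins for services; in a real system the message would travel over HTTP or a queue.</p>
<pre><code class="prettyprint lang-ruby"># Hypothetical stand-in for the tasks-get service.
class TasksGet
  def initialize(tasks)
    @tasks = tasks
  end

  # Query method: returns data and changes nothing.
  def call(list_id)
    @tasks.select { |task| task[:list_id] == list_id }
  end
end

# Hypothetical stand-in for the completed-tasks-get service.
class CompletedTasksGet
  def initialize(tasks_get)
    @tasks_get = tasks_get
  end

  # Proxies the message to tasks-get, then filters the result.
  def call(list_id)
    @tasks_get.call(list_id).select { |task| task[:completed] }
  end
end

tasks_get = TasksGet.new([
  { id: 1, list_id: 1, completed: true },
  { id: 2, list_id: 1, completed: false },
  { id: 3, list_id: 2, completed: true }
])

completed = CompletedTasksGet.new(tasks_get).call(1)
</code></pre>
<p>The wrapper never touches the task data itself. It only knows the message tasks-get understands, which is the same constraint the services have.</p>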
<p>Code is supposed to express the intent of the program and this is more easily achieved if the service has one clearly defined purpose. A service might need sophisticated permissions rules so one might choose a language better at expressing lists of rules or logic programming. Some services might just read and return data with minimal transformation and be represented as a simple function conceptually as well as in code.</p>
<p>I’ve begun work on a site: <a href="http://oosa.info">oosa.info</a>.</p>
<p>I’d like to use this site to document and discuss object oriented design, micro-services, and system architecture in general. I’ve set up a simple homepage that links to a forum for now. I plan to add articles, tools and libraries, and link to popular newsletters. If you want to help, feel free to <a href="https://github.com/myobie/oosa.info">fork the website</a>.</p>
tag:myobie.svbtle.com,2014:Post/use-a-cdn-for-every-site-you-build2015-01-17T16:11:55-08:002015-01-17T16:11:55-08:00Always use a CDN<p>Doing maintenance on a 6-year-old project today that “didn’t have the budget” for a <a href="https://en.wikipedia.org/wiki/Content_delivery_network">CDN</a> reminded me how important one can be. I’m writing this to remind myself to stick to my guns and always make time for the important things: one of which is using <a href="https://aws.amazon.com/cloudfront/">Cloudfront</a> or <a href="https://www.cloudflare.com">Cloudflare</a> every time.</p>
<p>Not every site needs to “scale” (whatever that means), but it’s a complete waste of resources to keep answering requests for the same files over and over again. If a project is hosted on Heroku, then there is <a href="https://devcenter.heroku.com/articles/http-caching">no web server in front of the application</a> to intercept requests for files. The application has to answer and handle the same requests over and over again. Even if an app is behind a reverse proxy like Apache or nginx, the proxy is still answering and streaming files when it doesn’t need to.</p>
<p>Don’t underestimate just how easy and useful it is to save money by reducing the requests per second hitting the application boxes. While your site is keeping up fine now, when a spike happens it can deal with more traffic if it only has to serve the dynamic bits. If a page has 4 assets, then every page view means 5 requests (the page plus its 4 assets), so a spike of 100 extra page views per second means 500 extra requests per second. A page with lots of logos or icons or whatever might have 100s of assets and every page view is multiplied that much. The simple math is: don’t serve assets yourself.</p>
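<p>The arithmetic as a tiny function, under the simplifying assumption that the CDN answers every asset request from its cache (cache misses are ignored). The function name is made up for this sketch.</p>
<pre><code class="prettyprint lang-ruby"># Requests per second that actually reach the application boxes.
# Assumes the CDN serves every asset from cache once it is in place.
def app_requests_per_second(page_views_per_second, assets_per_page, cdn: false)
  requests_per_view = cdn ? 1 : 1 + assets_per_page
  page_views_per_second * requests_per_view
end

without_cdn = app_requests_per_second(100, 4)         # => 500
with_cdn = app_requests_per_second(100, 4, cdn: true) # => 100
</code></pre>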
<p>If you’re willing to repoint the primary domain’s DNS then using Cloudflare is super simple. Sign up, then change DNS settings. Sadly, <a href="https://support.cloudflare.com/hc/en-us/articles/203689034-Does-CloudFlare-provide-me-with-a-CDN-subdomain-or-hostname-">Cloudflare does not provide a hostname</a> to use, so letting them proxy the entire domain is the only way.</p>
<p>My favorite CDN is Amazon’s Cloudfront and it’s amazingly simple to set up. All CDNs work the same: the <strong>origin</strong> server is the application’s primary domain. Amazon provides a unique hostname to use; just append the original file’s path. When Amazon doesn’t have a file in its cache it will ask the origin domain for a fresh copy at that path. All one has to do is prepend the hostname like: <code class="prettyprint">https://abcdefg123.cloudfront.net/images/cats.gif</code>. Another great thing is Amazon provides SSL for all Cloudfront subdomains, so no mixed content warnings or anything like that.</p>
<p>I just recently set this up for a Rails project hosted on Heroku by simply making this change to <code class="prettyprint">config/environments/production.rb</code>:</p>
<pre><code class="prettyprint">config.action_controller.asset_host =
"https://abcdefg123.cloudfront.net"
</code></pre>
<p>To make things easier for myself I just set it to always use <code class="prettyprint">https</code> even if the page including the asset is <code class="prettyprint">http</code>. It’s possible to use URLs like <code class="prettyprint">//abcdefg123.cloudfront.net/images/cats.gif</code> without the protocol which tells the browser “use the same protocol as the base document” kinda sorta. Try it out if you haven’t.</p>
<p>For any other type of application or framework it’s pretty simple to do something like this pseudocode:</p>
<pre><code class="prettyprint">module.exports = function asset_path(path) {
  if ($application.env === "production") {
    return $application.config.asset_host + path;
  } else {
    return path;
  }
};
</code></pre>
<p>(Yeah, that’s javascript referencing a global named <code class="prettyprint">$application</code>. Sorry.) The basic idea is to make sure every time the path is output for an asset that it is “rendered” so in production it can use a different host.</p>
<p>Since Cloudflare takes over the DNS they let any request for a file not currently cached pass through. This means less configuration up front, but it <u>can</u> lead to confusion if it’s a person’s first time using a CDN. It’s possible to configure Cloudfront to take over the origin domain as well, but it’s not required with Amazon’s setup.</p>
<p>After setting up a CDN and making sure all assets have the correct hostname in the URLs, it’s important to set the correct headers so the CDN knows how long to keep the files. <a href="https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers">Heroku has a great overview of HTTP caching</a> and you should read it. The simplest thing to do is to add a <code class="prettyprint">Cache-Control</code> header for every asset requested by the CDN backend:</p>
<pre><code class="prettyprint">Cache-Control: public, max-age=31536000
</code></pre>
<p>This is precisely what Rails does when it serves an asset in production, so I got pretty lucky there. All good web frameworks will do the same thing. If your project is framework-less or using some new-awesome-hipster-cool-thing-framework then it should be simple to add this header when an asset is requested: if it’s not then provide a patch or module or re-consider your choice.</p>
<p>Finally, <strong>check the server access logs to see the newfound amazingness!</strong> Offloading traffic can get way more money out of the application boxes and might even allow one to scale down. Not enough writing online talks about scaling down instead of up.</p>
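<p>For a framework-less ruby app, adding the header could be as small as the Rack-style middleware sketched below. The class name and the <code class="prettyprint">/assets/</code> prefix are assumptions, and header names are lowercase because newer Rack versions expect that. A year-long <code class="prettyprint">max-age</code> is only safe when asset file names change whenever their content changes.</p>
<pre><code class="prettyprint lang-ruby"># Hypothetical Rack-style middleware; "/assets/" is an assumed path prefix.
class AssetCacheControl
  ONE_YEAR = 31_536_000 # seconds

  def initialize(app, prefix: "/assets/")
    @app = app
    @prefix = prefix
  end

  # Rack apps take an env hash and return [status, headers, body].
  def call(env)
    status, headers, body = @app.call(env)
    if env["PATH_INFO"].to_s.start_with?(@prefix)
      headers = headers.merge("cache-control" => "public, max-age=#{ONE_YEAR}")
    end
    [status, headers, body]
  end
end

# A stand-in application that answers every request the same way.
app = ->(env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
stack = AssetCacheControl.new(app)

_, asset_headers, _ = stack.call({ "PATH_INFO" => "/assets/cats.gif" })
_, page_headers, _ = stack.call({ "PATH_INFO" => "/about" })
</code></pre>
<p>Only the asset response gets the header, so dynamic pages are never cached by accident.</p>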
tag:myobie.svbtle.com,2014:Post/why-make-a-mash2014-12-19T07:45:39-08:002014-12-19T07:45:39-08:00Why make a Mash?<h1 id="a-hrefhttpsgithubcomintrideahashiehashiea-is_1">
<a href="https://github.com/intridea/hashie">Hashie</a> is fine <a class="head_anchor" href="#a-hrefhttpsgithubcomintrideahashiehashiea-is_1">#</a>
</h1>
<p>Recently, Richard Schneeman wrote a very good article titled <a href="http://www.schneems.com/2014/12/15/hashie-considered-harmful.html">Hashie Considered Harmful - An Ode to Hash and OpenStruct</a>. Give it a read, there is some wisdom there. However, I have a bit of a different take on this issue. I’ve also had this as a draft in Svbtle for way too long.</p>
<p>First, let’s get this straight: if <code class="prettyprint">OpenStruct</code> is useful then <code class="prettyprint">Hashie::Mash</code> is useful too. And <code class="prettyprint">OpenStruct</code> <strong>is</strong> really useful. Also, don’t let anyone tell you “you don’t need a hash-like object that responds to methods” because you very well might need it. Always contrast your goals and the implementation of a library to make sure it’s as simple as it could be.</p>
<p>Second, don’t take advice about what to use from people who can’t explain the pain or joy around it. It’s like someone who says to use postgres instead of mysql, but has no clear reason to prefer anything. What is the real pain here? What is the real benefit? What circumstances were there?</p>
<p>To be clear, Richard explains that misspellings, insensitive access to hash-like object keys, and increased memory usage can cause issues, and he is correct. However, from certain perspectives, the tools shouldn’t try to help with misspellings at all: Javascript objects don’t raise on missing keys. Memory usage is relative to each application and my libs are not generally my problem, so we differ here too.</p>
<p>I will try to detail why I always tell people to use what they want, but they probably don’t need <code class="prettyprint">Hashie</code> anyway.</p>
<h1 id="why-not-to-use-code-classprettyprinthashiecod_1">Why not to use <code class="prettyprint">Hashie</code> <a class="head_anchor" href="#why-not-to-use-code-classprettyprinthashiecod_1">#</a>
</h1>
<p>There is one very good reason not to use <code class="prettyprint">Hashie::Mash</code> at all that I don’t see explained very often: <code class="prettyprint">#zip</code>.</p>
<pre><code class="prettyprint lang-sh">$ gem install hashie
$ irb -rhashie
</code></pre>
<pre><code class="prettyprint lang-ruby">address = Hashie::Mash.new(street: "100 Street St", city: "city", zip: 10119)
address.zip # => [[["street", "100 Street St"]], [["city", "city"]], [["zip", 10119]]]
</code></pre>
<p>It’s honestly that simple. <code class="prettyprint">Mash</code> inherits from <code class="prettyprint">Hash</code> which includes <code class="prettyprint">Enumerable</code> and you have a huge number of keys (175) that have surprising behavior. However, this does not mean <code class="prettyprint">Hashie</code> is bad or not useful, it’s just how it works and one needs to know that. </p>
<h1 id="what-to-use-instead_1">What to use instead <a class="head_anchor" href="#what-to-use-instead_1">#</a>
</h1><h2 id="maybe-code-classprettyprintopenstructcode-is_2">Maybe <code class="prettyprint">OpenStruct</code> is “better” <a class="head_anchor" href="#maybe-code-classprettyprintopenstructcode-is_2">#</a>
</h2>
<p>No, <code class="prettyprint">OpenStruct</code> is slow. Let’s see some data comparing it to <code class="prettyprint">Hashie</code>:</p>
<pre><code class="prettyprint lang-sh">$ gem install benchmark-ips hashie
$ irb -rbenchmark/ips -rhashie -rostruct
</code></pre>
<pre><code class="prettyprint lang-ruby">Benchmark.ips do |x|
x.report("ostruct") { OpenStruct.new(street: "100 Street St").street }
x.report("hashie") { Hashie::Mash.new(street: "100 Street St").street }
end
</code></pre>
<pre><code class="prettyprint lang-txt">Calculating -------------------------------------
ostruct 12.509k i/100ms
hashie 23.823k i/100ms
-------------------------------------------------
ostruct 135.329k (± 7.8%) i/s - 675.486k
hashie 313.649k (± 5.9%) i/s - 1.572M
</code></pre>
<p><strong><code class="prettyprint">Hashie</code> is at least twice as fast</strong> for the simple case of building a hash-like object and calling a method on it. This is what my normal usage of these tools looks like, grab some data and call methods on the resulting objects.</p>
<h2 id="oh-why-would-one-use-openstruct-then_2">Oh, why would one use “OpenStruct” then? <a class="head_anchor" href="#oh-why-would-one-use-openstruct-then_2">#</a>
</h2>
<p><code class="prettyprint">OpenStruct</code> compiles the method into the instance so repeated calls will be fast. Here is what that looks like:</p>
<pre><code class="prettyprint lang-sh">$ gem install benchmark-ips hashie
$ irb -rbenchmark/ips -rhashie -rostruct
</code></pre>
<pre><code class="prettyprint lang-ruby">Benchmark.ips do |x|
x.report("ostruct") {
o = OpenStruct.new(street: "100 Street St")
100.times { o.street }
}
x.report("hashie") {
m = Hashie::Mash.new(street: "100 Street St")
100.times { m.street }
}
end
</code></pre>
<pre><code class="prettyprint lang-txt">Calculating -------------------------------------
ostruct 4.563k i/100ms
hashie 1.363k i/100ms
-------------------------------------------------
ostruct 46.592k (± 3.7%) i/s - 232.713k
hashie 13.598k (± 4.1%) i/s - 68.150k
</code></pre>
<p><strong><code class="prettyprint">OpenStruct</code> is over three times faster</strong> for repeated calls to keys. So for long lived objects, <code class="prettyprint">OpenStruct</code> is way better than <code class="prettyprint">Hashie</code>. However, there is something <strong>even better</strong> for long lived objects: <code class="prettyprint">Struct</code>. If your objects are really that long lived you will probably know their schema and you can just make a class (<code class="prettyprint">Struct</code> is a class factory, so use it) that conforms to that schema. </p>
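<p>For the address example from earlier, a <code class="prettyprint">Struct</code> could be as short as this. The <code class="prettyprint">Address</code> name is just for illustration.</p>
<pre><code class="prettyprint lang-ruby"># Hypothetical schema for the address example used earlier.
Address = Struct.new(:street, :city, :zip)

address = Address.new("100 Street St", "city", 10119)
address.zip  # => 10119, the member accessor wins over Enumerable#zip
address.to_h # => {:street=>"100 Street St", :city=>"city", :zip=>10119}
</code></pre>
<p>A misspelled member like <code class="prettyprint">address.zipp</code> raises <code class="prettyprint">NoMethodError</code> instead of quietly returning <code class="prettyprint">nil</code>.</p>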
<h2 id="what-does-this-mean_2">What does this mean? <a class="head_anchor" href="#what-does-this-mean_2">#</a>
</h2>
<p>What it always means: the tools one chooses to use should be tailored to the use case.</p>
<p>I build a lot of APIs and those APIs all produce and consume JSON which in ruby is best represented as <code class="prettyprint">Hash</code>es or <code class="prettyprint">Array</code>s of <code class="prettyprint">Hash</code>es. However, one of these lines of code is prettier:</p>
<pre><code class="prettyprint lang-ruby">task_ids = tasks.map(&:id)
task_ids = tasks.map { |task| task["id"] }
</code></pre>
<p>There are other examples too where using methods is much preferred from a stylistic point of view. My APIs change a lot at first, so dynamically providing the <code class="prettyprint">Hash#keys</code> as methods allows me to move quicker. It’s possible that eventually I would define a <code class="prettyprint">Struct</code> for each version of each API, which is an easy refactor since the tests all still pass because nothing really changes.</p>
<p>If we shouldn’t use <code class="prettyprint">Hashie</code> and <code class="prettyprint">OpenStruct</code> is slow, what do we do?</p>
<h1 id="i-made-my-own-mash_1">I made my own Mash <a class="head_anchor" href="#i-made-my-own-mash_1">#</a>
</h1>
<p>Yeah, I know, NIH and all that. But, as I typed above, evaluate tools on what they are being or will be used for. For my API producing/consuming applications I need hash wrappers that are fast and use very little memory. These wrapped objects are not long lived.</p>
<p>My library is called <a href="https://github.com/myobie/mashed"><code class="prettyprint">Mashed</code></a>. It does three things: provides indifferent access in a predictable way, provides a hash wrapper that has a very small method footprint, and represents the internal hash’s keys as methods.</p>
<h2 id="indifferent-access_2">Indifferent access <a class="head_anchor" href="#indifferent-access_2">#</a>
</h2>
<p>Symbols in ruby are kinda annoying. Until 2.2 (due out very soon I guess) they are not garbage collected, so technically you could allow anyone to DDOS your app by making every JSON object into a symbolized hash. (I’ve considered just going down this road and making sure I monitor my app correctly, but I’ve never actually done it.) Luckily this will go away when they are garbage collected and I will even change my implementation when that happens.</p>
<p>But for now, I created the <a href="https://github.com/myobie/mashed#stringyhash"><code class="prettyprint">StringyHash</code></a>. It <strong>does not</strong> inherit from <code class="prettyprint">Hash</code>, but instead wraps and delegates to a hash instance. The <a href="https://github.com/myobie/mashed/blob/master/lib/mashed/stringy_hash.rb">method footprint</a> is small and it doesn’t extend any built-in ruby classes at all.</p>
<p>The example from the README should explain it all:</p>
<pre><code class="prettyprint lang-ruby">StringyHash = Mashed::StringyHash
sh = StringyHash.new(title: "Hello", starred: false, completed_at: nil)
sh.keys # => ["title", "starred", "completed_at"]
sh[:title] # => "Hello"
sh["title"] # => "Hello"
class Title
def to_s
"title"
end
end
sh[Title.new] # => "Hello"
</code></pre>
<p>The goal is to be a very sensible delegator to the internal hash instance. I’ve had zero issues so far with this in many production systems. For 2.2 I will make a <code class="prettyprint">SymbolizedHash</code> class I guess.</p>
<h2 id="wrapper-with-very-small-method-footprint_2">Wrapper with very small method footprint <a class="head_anchor" href="#wrapper-with-very-small-method-footprint_2">#</a>
</h2>
<p>In ruby, every object has a lot of built-in methods.</p>
<pre><code class="prettyprint lang-ruby">Object.new.methods.count # => 55
</code></pre>
<p>Every ruby object has at least 55 methods. If the goal is to provide almost any key that might be set in a <code class="prettyprint">Hash</code> as a method, that is 55 keys that are impossible to get to. Luckily, ruby allows one to start from a smaller point with <code class="prettyprint">BasicObject</code>.</p>
<pre><code class="prettyprint lang-ruby">BasicObject.new.methods.count # =>
# NoMethodError: undefined method `methods' for #<BasicObject:0x007fd454b8fdc8>
</code></pre>
<p>That’s right, it doesn’t even know what its list of methods is. My <a href="https://github.com/myobie/mashed/blob/master/lib/mashed/mash.rb"><code class="prettyprint">Mash</code></a> inherits from <code class="prettyprint">BasicObject</code> and provides a very small number of built-in methods.</p>
<pre><code class="prettyprint lang-sh">$ gem install mashed
$ irb -rmashed
</code></pre>
<pre><code class="prettyprint lang-ruby">Mashed::Mash.new({}).methods.count # => 26
</code></pre>
<p>I’m always trying to get that number lower as well. Please, if you ever have ideas for how to do that then make a <a href="https://github.com/myobie/mashed/pulls">PR</a> or <a href="https://github.com/myobie/mashed/issues">Issue</a>.</p>
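<p>If you want to compare footprints yourself without calling <code class="prettyprint">#methods</code> on an instance, you can ask the class instead. This is only a sketch: <code class="prettyprint">Bare</code> is a made-up class, and the exact counts differ between ruby versions, so none are shown.</p>

```ruby
# Hypothetical empty class, for illustration only.
class Bare < BasicObject
end

# Module#instance_methods lists the public and protected instance methods
# of a class, so it works even though a Bare instance has no #methods.
object_methods = Object.instance_methods
bare_methods   = Bare.instance_methods

bare_methods.size < object_methods.size # => true
bare_methods.include?(:inspect)         # => false
bare_methods.include?(:__send__)        # => true
```

<p>Every name missing from the second list is a key that can be exposed as a method without clashing with something built in.</p>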
<h2 id="delegate-methods-to-keyvalue-lookups_2">Delegate methods to key/value lookups <a class="head_anchor" href="#delegate-methods-to-keyvalue-lookups_2">#</a>
</h2>
<p>Now, how does <code class="prettyprint">Mash</code> fare in the <code class="prettyprint">#zip</code> example?</p>
<pre><code class="prettyprint lang-sh">$ gem install mashed
$ irb -rmashed
</code></pre>
<pre><code class="prettyprint lang-ruby">address = Mashed::Mash.new(street: "100 Street St", city: "city", zip: 10119)
address.zip # => 10119
</code></pre>
<p>It works in an unsurprising manner. The “secret” to <code class="prettyprint">Mash</code> being a good citizen is for it not to be hash-like at all.</p>
<p>Examples:</p>
<pre><code class="prettyprint lang-ruby">address["zip"] # =>
# NoMethodError: private method `[]' called for #<Mashed::Mash:0x007fb501049cd8>
address.merge(state: "VA") # =>
# NoMethodError: undefined method `merge' for #<Mashed::Mash:0x007fb501049cd8>
address.map { |k,v| puts "#{k}: #{v}" } # =>
# NoMethodError: undefined method `map' for #<Mashed::Mash:0x007fb501049cd8>
address.inspect # => "#<Mashed::Mash @hash=>{\"street\"=>\"100 Street St\", \"city\"=>\"city\", \"zip\"=>10119}>"
</code></pre>
<p><strong>It just refuses to appear to be a <code class="prettyprint">Hash</code>.</strong></p>
<p>There are still problems, one of which is <a href="https://github.com/myobie/mashed/issues/4">an issue right now</a>: single method calls with zero arguments return <code class="prettyprint">nil</code> if the key is missing. This is inevitable based on the current design constraints: <code class="prettyprint">Mash</code> acts like a JavaScript <code class="prettyprint">Object</code> where missing keys are <code class="prettyprint">undefined</code>.</p>
<p>I find this unsurprising since accessing a missing key on a <code class="prettyprint">Hash</code> returns <code class="prettyprint">nil</code>. However, I am considering making a monad or something to possibly make it easier to understand.</p>
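<p>The general technique behind this behaviour can be sketched with <code class="prettyprint">BasicObject</code> and <code class="prettyprint">method_missing</code>. This is an assumption about the approach, not the gem’s code: <code class="prettyprint">TinyMash</code> is a made-up name, and raising <code class="prettyprint">ArgumentError</code> when arguments are passed is my own choice for the sketch.</p>

```ruby
# Hypothetical sketch, NOT the Mashed implementation.
class TinyMash < BasicObject
  def initialize(hash)
    @hash = {}
    hash.each { |key, value| @hash[key.to_s] = value }
  end

  # Any method ruby cannot find is treated as a key lookup.
  def method_missing(name, *args)
    # Kernel#raise is not mixed into BasicObject, so call it explicitly.
    ::Kernel.raise ::ArgumentError, "#{name} takes no arguments" unless args.empty?
    @hash[name.to_s] # a missing key falls through to nil, like Hash#[]
  end
end

tiny_address = TinyMash.new(street: "100 Street St", zip: 10119)
tiny_address.zip     # => 10119
tiny_address.country # => nil
```

<p>Since <code class="prettyprint">BasicObject</code> has no <code class="prettyprint">#zip</code> of its own, the call lands in <code class="prettyprint">method_missing</code> and reads the key. The same path is what makes a missing key return <code class="prettyprint">nil</code> instead of raising.</p>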
<h1 id="why-is-code-classprettyprinthashiecode-not-li_1">Why is <code class="prettyprint">Hashie</code> not like <code class="prettyprint">Mashed</code>? <a class="head_anchor" href="#why-is-code-classprettyprinthashiecode-not-li_1">#</a>
</h1>
<p>Because it’s a different tool. <code class="prettyprint">Hashie</code> is actually a great library and everyone should not only try to use it at least once, but read through its code. You can learn a ton by seeing how others have solved similar problems.</p>
<p><code class="prettyprint">OpenStruct</code> is awesome too. If you’re making a ruby script and you want to have no dependencies outside the standard library then use it; this happens to me when I’m working on build or deployment scripts.</p>
<h1 id="use-what-works-for-youre-current-situation_1">Use what works for your current situation <a class="head_anchor" href="#use-what-works-for-youre-current-situation_1">#</a>
</h1>
<p>Write tests, evaluate libraries based on their implementation and API, and don’t listen to anyone, including me (:</p>