Now, the notes I made are rough, and it's been about 6 months since the work was performed, so I may be missing some details. But, I wanted to share my process in hopes of figuring out where and how I could improve.
Some project background
The app was actually 2 Rails apps: 1 for the front-end and 1 for the API. The front-end Rails app had custom controllers/models that represented API resources. The API app was a rather standard Rails API setup.
The original developers were not available to discuss and walk through any code.
And, the project was rather low-traffic, so this wasn't an issue at massive scale. This gave me hope the issue would be easier to spot :)
Inspecting the database
As this was a totally unfamiliar app to me, and the original developers were not around to help walk through the code, I started with what required the least effort to improve performance: SQL query performance tuning!
Luckily, for me, the app was hosted on Heroku and had a paid Heroku PostgreSQL database. Lucky, because you can simply click on the Database resource and then choose the "Diagnose" tab to inspect frequent and potentially slow queries. This will show you the SQL queries on the right and 2 graphs on the left. The graphs show Number of Invocations and Average Time Per Invocation. Here's a sample from a different app:
Short story: SQL did not appear to be the problem.
The first thing I noticed was that this project was using an old version of Ruby, which was nearing end of life for support. And, many of the versions released since touted performance gains. I made note of this as a last resort. Really, I didn't think updating the runtime would or should solve the problem by an acceptable magnitude. Sure, perhaps some of the Ruby language improvements would help this project, but I doubted (still do to this day) that it would have been significant enough to call it a day.
After my generalized approaches failed to find anything significant, I moved on to a more specialized tool, Skylight. Skylight is a Ruby-specific performance reporting service. And, thankfully, the client had Skylight enabled for a few weeks (IIRC they added it only after experiencing slow load times).
Just like the Heroku monitoring graphs, Skylight identified the same views as being slow. Where it had the advantage was that Skylight also showed Ruby/Rails-specific runtime details, matching response times to controller methods rather than just URL endpoints.
Right away, I noticed something about "Object Allocations": they were through the freaking roof. This meant Ruby was using wayyyy too much memory.
Narrowing in on the offending code
Skylight was nice enough to point me to the exact controller method being called that resulted in the massive amount of object allocations. But, it didn't tell me the exact code (you know, like a line number!).
So I went hunting for "complex" looking code. Meaning, some function or method with loops, maybe too many references being created, that sort of thing. Nothing stood out. I inspected the classes being inherited from, and didn't see anything obvious there either.
Okay, so I was closer to the source, but could not pinpoint it. I needed to profile the code locally.
Rack Mini Profiler
rack-mini-profiler [RMP] is Rack middleware that displays a speed badge on every HTML page. Well, the first issue was that I was inspecting the JSON API Rails app, which doesn't serve HTML pages. But, it still had an HTML 404 page, yay!
So, we can hit our API locally with some requests, RMP will do its thing, and then I can manually load the 404 HTML page in a browser, which shows the previous requests via the RMP badge.
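For anyone following along, the RMP setup itself is small. A minimal sketch, assuming a standard Rails app (the initializer file name is hypothetical, and the explicit middleware insertion is only needed in setups where it isn't added automatically, such as some API-only configurations):

```ruby
# Gemfile
gem 'rack-mini-profiler'

# config/application.rb (only if the middleware isn't auto-inserted,
# e.g. in some API-only Rails setups):
#   config.middleware.insert(0, Rack::MiniProfiler)
```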
Similar to what Skylight was showing me, I could clearly see a slow request stand out.
I wanted to add some query string parameters to customize my mini-profiler usage. I loaded up Insomnia for making HTTP requests to my local API server, hitting the endpoint identified in the steps above.
Following this blog article by Nate Berkopec, I figured the Garbage Collection feature was what I was after, and added the ?pp=profile-gc flag to my POST request.
This Garbage Collection flag would track object allocations. So, I needed a baseline. I enabled the default Rails homepage and hit that:
New objects: 3962
Then, I made a request against the slow endpoint:
New objects: 161271
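What profile-gc reports can be approximated in plain Ruby using the cumulative allocation counter in GC.stat, which is handy for sanity-checking numbers like the ones above outside of RMP. A small sketch (the helper name is my own):

```ruby
# Count roughly how many objects a block allocates, using the
# cumulative counter Ruby's GC exposes (it only ever increases).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

cheap     = allocations { 10.times { Object.new } }
expensive = allocations { 10_000.times { Object.new } }

puts "cheap: #{cheap}, expensive: #{expensive}"
```

A request handler that allocates 40x its baseline, like the endpoint above, shows up immediately with this kind of before/after diff.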
But, still I have not yet been able to narrow down a chunk of code that is causing this excessive object allocation. I have only confirmed it in more detail.
I then tried the profile-memory feature of RMP.
Total allocated: 14772397 bytes (162495 objects)
Total retained: 1647863 bytes (12389 objects)

allocated memory by gem
-----------------------------------
5765128  aws-sdk-v1-1.64.0
3202115  activesupport-4.1.5
2962795  activerecord-4.1.5
1764149  2.1.3/lib
...
After double-checking the AWS library for performance issues, or a newer version with improvements, I moved down to the ActiveSupport and ActiveRecord libraries. These are part of Rails and are related to accessing your database. But, we knew at this point that SQL wasn't the issue. That led me to conclude the issue was something along the lines of:
Too much memory is being used when accessing data from the PostgreSQL database, causing the Ruby garbage collector to jump in, stressing the CPU and crashing the app.
I eventually identified an Active Record method being used in a loop: serialize. This view had quite a lot of records, with nested model records, and the way it was being serialized caused all of the records' attributes to be initialized in memory.
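I no longer have the original code, but the shape of the problem can be sketched in plain Ruby (the Record struct and its attribute names are made up for illustration): serializing every attribute of every record in a loop, versus building only the fields the front-end reads.

```ruby
# Hypothetical stand-in for an ActiveRecord model.
Record = Struct.new(:id, :name, :body, :metadata, keyword_init: true)

records = Array.new(3) do |i|
  Record.new(id: i, name: "item-#{i}", body: "x" * 1_000, metadata: { audit: true })
end

# Anti-pattern: serialize every attribute of every record.
full = records.map(&:to_h)

# Fix: build only the attributes the front-end actually needs.
slim = records.map { |r| { id: r.id, name: r.name } }
```

With ActiveRecord itself, the same idea is usually expressed with pluck or select, which avoid instantiating the unused columns in the first place.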
Unfortunately, I don't have access to the codebase anymore, and it's been ~6 months. But, from what I can recall, I ended up caching references for only the attributes that were needed by the front-end, instead of serializing all attributes of the records.
I believe I made symbols for the attributes I wanted (which are only initialized once in Ruby, if I understand correctly).
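That understanding is right: symbols are interned, so every occurrence of the same symbol refers to one object, while a plain (non-frozen) string allocates a fresh object each time. A quick demonstration:

```ruby
# Every use of :name refers to the same interned object.
a = :name
b = :name
puts a.equal?(b)    # true: identical object

# Plain strings allocate a new object each time.
s1 = "name".dup
s2 = "name".dup
puts s1.equal?(s2)  # false: two separate allocations
```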
Then I reduced the amount of nested/related model data being serialized, again limiting it to only what the front-end needed, as well as caching nested references. There were parent model instances that had arrays of children, and many of those children were shared across parents. The original ActiveRecord code was re-initializing every single child instance instead of re-using a potentially already-initialized instance.
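I can't reproduce the original code, but the caching idea can be sketched like this (the data shapes and the build_child helper are hypothetical): build each shared child's serialized form at most once, and re-use the same object everywhere it appears, instead of rebuilding it per parent.

```ruby
# Hypothetical data: two parents share child 10.
parents = [
  { id: 1, child_ids: [10, 11] },
  { id: 2, child_ids: [10, 12] },
]

child_cache = {}

# Memoize: each child is built at most once, then re-used.
build_child = lambda do |id|
  child_cache[id] ||= { id: id, label: "child-#{id}" }
end

serialized = parents.map do |p|
  { id: p[:id], children: p[:child_ids].map { |cid| build_child.call(cid) } }
end
```

Both parents now hold a reference to the *same* hash for child 10, so the work (and the allocations) for shared children happen once instead of once per parent.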
In the end, this was enough to significantly reduce memory allocations, preventing the Ruby garbage collector from eating CPU and resulting in manageable response times.
Anyway, that's the path I took. I had very little previous experience with Ruby/Rails prior to this. But, with the reporting that comes with Heroku and their PostgreSQL hosting, as well as Skylight and the rather awesome Rack Mini Profiler, the task was much less daunting.
This isn't my area of expertise, but I certainly enjoy it. So if anyone has tips or advice, let me know!