Having trouble with the agent? Take a look here.
Before proceeding, make sure that you are running the latest version of the Skylight agent.
If you’re still having trouble after following this guide, please drop us a line.
Run bundle exec skylight doctor in your production terminal to check for common issues with the Skylight agent. You should see something like this as the output:
Checking SSL
  OK
Checking for Rails
  Rails application detected
Checking for native agent
  Native agent installed
Checking for valid configuration
  Configuration is valid
Checking Skylight startup
  Successfully started
  Waiting for daemon... Success
Pro Tip:
On Heroku you can access your production terminal with heroku run bash.
Common issues reported by skylight doctor include:
The skylight setup command automatically generates a valid config/skylight.yml file.
If you have edited this file, check that you haven’t introduced any syntax errors or invalid configuration.
If you have removed this file, you will need to set the equivalent environment variables in any environments where you are running Skylight.
See also our documentation on setting your authentication token.
If you are still having trouble, please run Skylight.instrumenter.config in your production terminal, and contact us with the results.
By default, Skylight uses your Rails tmp path as the sockfile directory. This directory must be writeable and cannot be located on an NFS mount (e.g. Vagrant).
Set SKYLIGHT_SOCKDIR_PATH (in your env) or daemon.sockdir_path (in your config) to a writeable, non-NFS path, like /tmp.
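As a rough sanity check before changing the setting, the sketch below tests whether a candidate directory exists and is writeable by the current process. The helper name usable_sockdir? is hypothetical and not part of Skylight; it also does not detect NFS mounts, which you would need to confirm separately from your system's mount table.

```ruby
require "tmpdir"

# Hypothetical helper (not a Skylight API): true when `path` is an existing
# directory that the current process can write to. NFS is NOT detected here.
def usable_sockdir?(path)
  File.directory?(path) && File.writable?(path)
end

# Fall back to the system temp dir when the env var is unset.
candidate = ENV.fetch("SKYLIGHT_SOCKDIR_PATH", Dir.tmpdir)
puts "#{candidate}: #{usable_sockdir?(candidate) ? 'writeable' : 'not usable'}"
```

Run it as the same user that runs your app, since permissions can differ between users.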
By default, Skylight writes logs to log/skylight.log. In the event that this path is not writeable, set log_file in your config to a writeable path.
Verify that the Skylight gem is in the right group and that you have run bundle install.
If you’re using Sinatra or Grape, make sure you followed the installation instructions.
Check the Skylight agent server requirements to make sure your platform is supported.
To avoid taking your production application down due to an installation failure, Skylight does not raise an exception when it can’t install the native agent. If you’re running a compatible OS and still see this warning, it’s possible that when the gem was installed, the native extension libraries failed to download from S3. If this is the case, re-installing the gem with bundle pristine skylight may fix the issue.
If you still see errors after re-installing, try running your application with SKYLIGHT_REQUIRED=true. This will cause Skylight to raise an exception when the native agent is missing. This exception may be useful in troubleshooting the problem. If you need help, send the exception message and backtrace to us at support@skylight.io.
Looks like your SSL certificates are out of date. The skylight doctor command will provide further instructions. By default, we try to use your local SSL root certificates, but in the event those are out of date, you can force skylight setup to use Skylight’s bundled root certificates by running:
SKYLIGHT_FORCE_OWN_CERTS=1 bundle exec skylight setup <setup token>
This is likely due to a temporary network issue. Please try again.
If you are still having issues on subsequent retries, try running the following command to verify that you can connect to our authorization server:
curl -v https://auth.skylight.io/status
If curl fails, you may need to allow access to *.skylight.io on port 443 (e.g. if you are behind a firewall).
Finally, check our Status Page for the unlikely event that we are experiencing infrastructure issues.
If—after updating the agent and fixing issues identified by Skylight Doctor—your data is still not reporting, consider the following questions:
Make sure you are using the latest version of the Skylight gem. In the directory for your Rails app, run this command:
bundle list | grep skylight
You should see something like the following output:
* skylight (version number)
* skylight-core (version number)
Find the latest version on RubyGems.
Sometimes people forget to deploy their change. Oops! Don’t worry, we won’t tell anyone.
Make sure that there is traffic to your application. If your Rails app is handling requests, you should start to see data in Skylight in just a few minutes.
Verify that the application is running with the correct Rails environment. By default, the agent only starts in the production environment, but this can be configured.
To learn how to change what environments Skylight starts in, see Environments.
One surprisingly common mistake people make is disabling Skylight in their production environment:
# config/application.rb
config.skylight.environments = ["staging"] # DO NOT DO THIS
config.skylight.environments += ["staging"] # DO THIS
Is the skylight gem in the right group?
Verify that, in your app’s Gemfile, you’ve added the skylight gem to a group that will be installed in production. For example, if you add skylight to the development group, it will not run when you deploy to production.
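For illustration, a Gemfile laid out along these lines keeps Skylight available in production. This is a sketch of one common layout, not a required structure; the gem inside the group block is only a placeholder for your own development dependencies.

```ruby
# Gemfile (illustrative fragment)
source "https://rubygems.org"

# Top-level gems belong to the default group, which is installed and
# loaded in every environment, including production.
gem "skylight"

group :development, :test do
  # Gems listed here are typically skipped in production, so Skylight
  # should not be declared inside this block.
  gem "rspec-rails" # placeholder for your own development gems
end
```

After changing the Gemfile, run bundle install and redeploy so the change reaches production.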
To make sure Skylight is activated, you may need to restart your Unicorn masters.
IMPORTANT:
If you notice that the pre-fork (master) has not restarted on a typical deploy, make sure that you run a true restart (stop/start), rather than a reload.
You might see the “No data within this time range” message if the agent running in your Rails app stops reporting performance data to Skylight. If you were able to see data before but it has stopped working recently, restarting your server will usually fix the issue.
If the agent encounters multiple errors in a short span of time, it will shut itself down. This is done out of an abundance of caution to ensure that a potential bug in the agent doesn’t bring down your app in production.
We are working to add more logging to the agent so we can better diagnose what causes the agent to shut down and recover gracefully in the event of an error. If you find this is happening regularly, please let us know!
Please check your logs for Skylight errors.
This is due to our use of two different compression algorithms for the data. In some rare cases, especially when there were only one or two requests in a range, there’s a slight mismatch between the algorithms. We hope to resolve this in future iterations.
When the Skylight instrumenter starts, it attempts to determine what sort of process is running (either a web server or a background job processor). As there is no standard interface for doing so, it relies on a number of known hints. If your app uses a more bespoke setup, Skylight may send data to the wrong component. This is more likely to happen if you have set a custom worker component name, in which case it can be solved by setting SKYLIGHT_COMPONENT=web in the environment that runs your web server.
Similarly, if you have background job data reported to your web server, setting SKYLIGHT_COMPONENT=worker when running your background jobs should tell Skylight to direct these traces to your worker component.
Even if this fixes your issue, please do email support@skylight.io and let us know exactly what commands you use to start your server or background job processors. We’d like to automatically handle as many of these as possible.
The Skylight agent will log errors to log/skylight.log or, on Heroku, to STDOUT (look for lines starting with [SKYLIGHT]).
[E0001] Spans were closed out of order and invalid span nesting
This error indicates that a parent span (event sequence item) was closed before all of its children were.
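To make the nesting rule concrete, here is a minimal sketch of a stack-based span tracker that rejects a parent being closed while a child is still open. SpanStack and its method names are hypothetical and only illustrate the invariant; this is not how the Skylight agent is implemented.

```ruby
# Illustrative only: a tiny span tracker, not Skylight's implementation.
class SpanStack
  class OutOfOrderError < StandardError; end

  def initialize
    @open = [] # innermost open span is last
  end

  def start(name)
    @open.push(name)
    name
  end

  # Only the innermost open span may be finished.
  def finish(name)
    unless @open.last == name
      raise OutOfOrderError,
            "closed #{name.inspect} while #{@open.last.inspect} was still open"
    end
    @open.pop
  end
end

spans = SpanStack.new
spans.start("request")
spans.start("db.query")
begin
  spans.finish("request") # parent closed before its child
rescue SpanStack::OutOfOrderError => e
  puts e.message
end
```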
One common cause of this issue is a Middleware that doesn’t conform to the Rack SPEC. Specifically, “If the body is replaced by a middleware after action, the original body must be closed first, if it responds to close.” If you are unable to fix the Middleware, you can remove the Middleware probe with config.skylight.probes -= ['middleware'] in your Rails config (note that you will no longer see individual Middleware in your endpoints list or endpoint event sequences; all Middleware will show up as “Rack” instead).
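If you can fix the Middleware, the change is usually small. The sketch below shows a middleware that replaces the response body and closes the original first, as the quoted SPEC sentence requires. ReplaceBody and its canned body are hypothetical names chosen for the example.

```ruby
# Illustrative middleware: replaces the response body, closing the
# original body first when it responds to `close`.
class ReplaceBody
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)

    # Close the original body before discarding it.
    body.close if body.respond_to?(:close)

    new_body = ["replaced"]
    # Drop any stale content-length (any casing) and set the new one.
    headers = headers.reject { |key, _| key.casecmp?("content-length") }
    headers["content-length"] = new_body.sum(&:bytesize).to_s

    [status, headers, new_body]
  end
end
```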
[E0002] You've exceeded the number of unique span descriptions per-request. and A payload description produced <too many uniques>
Skylight limits the number of unique span (event sequence item) descriptions in a request to 100 to prevent misbehaving apps from causing trouble. In most cases, this will be fine as even items such as similar repeated queries will be normalized into a single description (e.g. SELECT * FROM users WHERE id = 1 and SELECT * FROM users WHERE id = 2 both become SELECT * FROM users WHERE id = ?). However, if your request is extremely complex it is possible to exceed this limit. To resolve this issue, reduce the number of uniquely named items that you instrument.
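The idea behind that normalization can be sketched with a couple of regular expressions. This is a deliberately naive illustration with a hypothetical normalize_sql helper; the agent's real SQL lexer is more thorough and is not shown here.

```ruby
# Rough illustration only: replace literals with "?" so that queries that
# differ only in their values share one description.
def normalize_sql(sql)
  sql
    .gsub(/'(?:[^']|'')*'/, "?")     # single-quoted string literals
    .gsub(/\b\d+(?:\.\d+)?\b/, "?")  # standalone numeric literals
end

queries = [
  "SELECT * FROM users WHERE id = 1",
  "SELECT * FROM users WHERE id = 2",
]

# Both queries collapse to a single unique description.
puts queries.map { |q| normalize_sql(q) }.uniq
```

If you add custom instrumentation, the same principle applies: keep variable data such as IDs out of span descriptions so they do not count as unique.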
[E0003] Exceeded maximum number of spans
Skylight limits the maximum number of spans (event sequence items) in a request to 2,048 to prevent a data overload. If your request is very complex, you may hit this limit and data will no longer be tracked for that request. A couple of common causes include:
[E0004] Failed to extract binds from SQL query. and Failed to lex SQL query
These errors indicate that the agent is unable to parse a SQL query in your application. These errors won’t prevent your application from operating, though they will reduce the information that we can display in the UI for these queries. They will be displayed in the Event Sequence as simply “SQL”, not the full sanitized query.
Generally, the reason you will see this error is because you’re using a syntax we do not recognize (often a more complex or non-standard syntax). We’ve optimized for the most common syntax constructions and plan to support more in the future.
When running across this error, please report it so we can learn what queries are important to our customers. If this error becomes too noisy, you can disable it by setting log_sql_parse_errors: false in your config. Alternately, you may selectively disable Skylight for certain noisy spans using Skylight.mute.
ERROR:skylight::cli: skylightd exiting abnormally; err=DaemonLockFailed
As per this issue in the agent repo, the fix for this is relatively quick. Just explicitly set daemon.sockdir_path in your config to a writeable, non-NFS path.
First, try syncing your GitHub account with Skylight. If that doesn’t work, it’s possible the GitHub organization that owns the repo needs to grant permissions in order for us to see that you are a member of the repo. Ask your GitHub org’s administrator to add Skylight to the approved access list, then sync your GitHub account again. You should then have access to the app(s) you expected to see!
If you don’t see your organization in the dropdown when trying to connect an app or your users encounter issues accessing your GitHub-connected apps, you may have your GitHub organization’s “Third-party application access policy” set to “Access restricted.”
You can check our current permissions by visiting the “Third-party access” settings for your organization. If the permissions are correct, Skylight will be marked as “Approved”:
Your organization must allow third-party access to Skylight so that we can use the GitHub API to access any information about your org. Click the link below the “Connect Your GitHub Repository” dropdown on the app settings page to update your GitHub permissions. This link takes you to a page in GitHub, where you can click the “Grant” button to give Skylight access to the necessary information:
There are a few reasons why this might happen.
Sorry, right now we only support repos that are connected to an organization, though we may allow use of personal repos in the future. Shoot us an email at support@skylight.io to let us know if this feature is important to you!
This is a known bug, but it’s quite rare these days and we’ve had trouble reproducing it, which has made it hard to fix. Please do email support@skylight.io and let us know exactly what steps you took when signing up, especially if there were any errors or issues along the way, or if you refreshed the invitation signup page at all before clicking “sign up with GitHub.” Any information about anything out of the ordinary helps!
In the meantime, you can ask the person who initially invited you to send another invitation to the email address associated with your account. This will add you to the app automatically and you should have no further issues.
Memory profiler gems like memory_profiler and derailed generally use Ruby’s ObjectSpace, a really nifty way to get information about all the objects allocated in your Ruby application, which is great for troubleshooting memory issues.
Digging into the source of both derailed and memory_profiler (which is used by derailed), we discovered the following line (source):
file = ObjectSpace.allocation_sourcefile(obj)
This line looks at an object and asks ObjectSpace who caused it to be allocated. As it turns out, objects allocated at require time are attributed to the file that calls the original Kernel#require method.
Now, this is all well and good except for one small problem: Skylight overwrites Kernel#require in order to properly install probes. This means that it is Skylight that calls the original Kernel#require, not whatever other gem is really doing the require. So, as soon as Skylight is loaded, every future require call is attributed to Skylight.
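To see how a wrapper ends up in the middle of every require call, consider the following sketch. RequireTap is a hypothetical module written for this example; it is not Skylight's actual hook, only a demonstration of the general pattern of intercepting require and then delegating to the original.

```ruby
# Illustrative only: a hypothetical wrapper, not Skylight's actual hook.
module RequireTap
  SEEN = []

  private

  def require(name)
    SEEN << name # a probe installer could react to `name` here
    super        # the wrapper, not the caller, invokes the original require
  end
end

# Prepending to Object puts the wrapper ahead of Kernel#require in method
# lookup for ordinary `require` calls.
Object.prepend(RequireTap)

require "json"
puts RequireTap::SEEN.inspect
```

Because the call to the original require now originates inside the wrapper's file, tools that look at the immediate caller of Kernel#require see the wrapper rather than the code that asked for the library.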
Skylight isn’t the only library to do this. ActiveSupport has a similar hook in its Loadable module that is included into Object. Without Skylight, it would be ActiveSupport taking the blame. However, because this hook uses super, it calls to the superclass Kernel that is now the version that Skylight created.
Even RubyGems itself gets in on the overwriting behavior. Though, if you use Bundler (as nearly everyone does), this is actually reversed.
So there you have it: run derailed without Bundler and you’ll see RubyGems blamed for your requires. Add in Bundler and the correct files will get blamed. Bring in ActiveSupport and then ActiveSupport will be blamed. Include Skylight and blame will shift to us. Good times!
So what’s the moral here? Know your tools. ObjectSpace and the libraries that use it do some really useful things. They aren’t perfect, but they’re open source. Read the code and find out a bit about how they work and you’ll be able to put them to better use!
Skylight performs setup when the gem is required, at which time Rails will be detected and tapped into. However, you may find you have to manually require skylight/railtie if you need to load the Skylight gem before Rails.
Unfortunately, it’s possible that your app has been the target of some malicious bots or abusive requests. Even if you see very few page views, that is not an accurate indicator of how many requests your app is receiving behind the scenes. An overzealous health checker is another possible source, if you are using such a service.
We strongly recommend that you look into the source of the abusive requests so they don’t continue or return. Many Skylight users have reported success using something like Rack Attack to deal with these requests. You can also use a library like IPCat with Rack Attack to access a list of known IPs for “datacenters, co-location centers, shared and virtual webhosting providers. In other words, ip addresses that end web consumers should not be using.”
If you want to avoid seeing these requests in Skylight, we recommend placing Rack::Attack at the top of your middleware stack (good practice even without Skylight) and then running Skylight immediately afterward.
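As a starting point, a Rack::Attack initializer might look like the sketch below. It assumes the rack-attack gem is in your Gemfile; the file path, rule names, limits, and the user agent pattern are all placeholders to adapt, not recommendations from Skylight.

```ruby
# config/initializers/rack_attack.rb (illustrative; tune for your app)

# Throttle each client IP to 300 requests per 300 seconds.
# The numbers are arbitrary placeholders.
Rack::Attack.throttle("requests by ip", limit: 300, period: 300) do |request|
  request.ip
end

# Block a hypothetical abusive user agent outright.
Rack::Attack.blocklist("block bad bots") do |request|
  request.user_agent.to_s.match?(/ExampleBadBot/i)
end
```

Check your request logs first to see which IPs, paths, or user agents dominate the unwanted traffic, then write rules that target them specifically.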
If you find yourself in this position, here are some tips (from an actual Skylight customer!) on how you might deal with finding and blocking the abusive requests: