Node.js - Runtime Performance Troubleshooting

  • Updated

Prerequisite

 

Objective

Running the Contrast Security agent along with a Node.js application (and any other technology actually) will always provide a little bit of an overhead resulting in increased Memory and CPU usage and request latency. As we know this is inevitable because we are running additional code - the agent’s code along with the code of the given application. We at Contrast Security are constantly working to minimize this footprint of our agent and we’ve gone to great lengths in this task, especially in the last year. Nevertheless, the increase in resource usage can prove to be troublesome for some testing and/or production environments, so here are steps to gather performance logs for Contrast Support.

 

Table of Contents

  1. Identify the Troublesome "Phase"
  2. Latency
    1. Route-metrics Logs
    2. Performance Logs
  3. Resources (CPU/Memory)
    1. Flame Graphs
    2. Memory Heap Dumps

Process

Identify the Troublesome "Phase"

We can divide the application life into two phases - application start-up and handling requests.
At the start of each new application, the agent is performing a rewriting of some of the code provided so it can be instrumented. And during the handling requests phase, the agent is tracking each request made. There is a possibility of performance issues in both of the phases. Here, we will talk only about performance issues in the second phase - when your application is handling requests. First, you can identify at what “phase” of your application your issue occurs. If it is in the start-up phase, please visit Node.js - Startup Troubleshooting. If it is in the handling requests phase, continue reading below.

 

Our default configuration is made such as to be suitable for a common type of customer and if you have a more specific case it might need a little tweaking. To do this we suggest relying on the experience of our support team and providing us with collected performance data that we will use to tailor a configuration that will answer your needs. If you are using some process monitoring tool - be sure to send us the logs from it. Other than that here are a few steps that you can take that will help us solve the issue you are facing:

 

Latency

If your performance issues are related to latency, please gather the following route-metrics logs and performance logs and open a ticket with customer support for help.

Route-metrics Logs:

We’ve developed a free and simple to use library that can collect data for the times each request spend being handled in our application and group it by route.

  1. Install the library to your project with npm install @contrast/route-metrics

  2. Start your application, but requiring @contrast/route-metrics first. Make sure you require the @contrast/route-metrics library before any other. If, for example, your usual application start script with the agent is node -r @contrast/agent index.js then to start it with @contrast/route-metrics you should change it to:
    node -r @contrast/route-metrics -r @contrast/agent index.js
  3. If no other configuration to @contrast/route-metrics is done a file named route-metrics.log should appear in your application’s working directory.

  4. Run your test suite or exercise the routes that you are having an issue with.

  5. Data about the requests routes, methods, and latency will be written to the route-metrics.log.

    Rename or move the file into another directory and repeat steps 2-4 but this time run your application without the agent so we can have a baseline of your routes performance
  6. Send the two log files to our team for analysis.

As you can see the library can be used with and without the agent so you can use it to collect data for yourself too. The way you can turn the log file into more readable data is by executing the following command in the working directory - npx -p @contrast/route-metrics -no log-processor route-metrics.log. The default output is in csv format. If you are interested you can check the additional settings of the library on its npm page - npm: @contrast/route-metrics.

 

Performance Logs:

Performance logs are only available with v4 Node agent. By default, the Contrast v5 Node agent has metric logging enabled under DEBUG level.

Node.js agent has a built in logger for request performance. To enable this logging you should:

  1. Enable DEBUG level of logging for the agent. This can be done either by changing the agent.logger.level to DEBUG in the agent’s configuration .yaml file or by setting an environment variable CONTRAST__AGENT__LOGGER__LEVEL=DEBUG.

  2. Enable request performance logging for the agent. This can be done either by changing the agent.node.req_perf_logging to true in the agent’s configuration .yaml file or by setting an environment variable CONTRAST__AGENT__NODE__REQ_PERF_LOGGING=true.

  3. Run your application with the agent and run your test suite or just exercise the routes that you are having issues with.

  4. Send the agent log from this run to our team for analysis.

Example YAML:

agent:
logger:
level: DEBUG
node:
req_perf_logging: true

If you check this log file you can find that for each request there is a line in the log like this one:

debug: [timestamp] contrast:req-perf-logger - {"path":"xxx","method":"xxx", 2"instrumentationTime":xxx,"totalTime":xxx,"events":[...]}

Here we can see the path for the request, its method, what time the agent instrumentation took, and what was the total time spent in handling the request. The events property will have a collection of functions that are instrumented by the agent and executed for handling the request. For each function we will see:

{ 
"name": "xxx",
"count": xxx,
"times": { "pre": xxx, "post": xxx, "orig": xxx }
}

This is the name of the function, the number of times it was called, the sum of time for the instrumentation done by the agent before (pre) and after (post) each function execution, and finally the time spent in the original function itself (orig). All times are in nanoseconds.


 

Resources (CPU/Memory)

If your performance issues are related to resources (high CPU usage or out of memory), please gather the following v-8 isolate logs and memory heap dump and open a ticket with customer support for help.

Flame Graphs:

Node.js has a built in profiler that you can use to create a log file with low level CPU data generated by the Node process. This log can be used to create a flame graph. When looking at the flame graph we will see how much processor time each function took, also its stack trace. This helps to indent the “weight” of each process/function in terms of CPU usage. To do this you should:

  1. Make sure the agent log level is set to ERROR. This can be done either by changing the agent.logger.level to ERROR in the agent’s configuration .yaml file or by setting an environment variable CONTRAST__AGENT__LOGGER__LEVEL=ERROR

  2. Start your application with the agent and the --prof flag, for example:
    node --prof -r @contrast/agent index.js
  3. Exercise the routes that you are having issues with and then quit the application.

    It’s important to go through the parts/routes that have problems so the eventual culprits for the bad performance can be identified. It’s not advisable to run long test suites for this log, because it will create huge files.
  4. Node.js will store the collected data in a file named isolate-xxxxxxxx-v8.log Repeat the first two steps but this time run your application without the agent like:

    node --prof index.js
  5. Send the two isolate-xxxxxxxx-v8.log files to our team for analysis

If you want to create the flame graph yourself you can accomplish this by using a tool called “flamebearer” available from NPM. GitHub - mapbox/flamebearer: Blazing fast flame graph tool for V8 and Node 🔥

 

Memory Heap Dumps:

Our agent has a built in library to generate memory heap dumps that can be analysed. To create them you should:

  1. Make sure the agent log level is set to ERROR. This can be done either by changing the agent.logger.level to ERROR in the agent’s configuration .yaml file or by setting an environment variable CONTRAST__AGENT__LOGGER__LEVEL=ERROR

  2. Enable heap dump generation for the agent. This can be done either by changing the agent.heap_dump.enable to true in the agent’s configuration .yaml file or by setting an environment variable CONTRAST__AGENT__HEAP_DUMP__ENABLE=true

  3. Exercise the routes that you are having issues with and then quit the application.

    It’s important to go through the parts/routes that have problems so the eventual culprits for the bad performance can be identified. It’s not advisable to run long test suites for these logs, because it will create huge files.
  4. By default, the agent will create 5 heap dump files in a folder named contrast_heap_dumps under the root directory for the app, each with a name ending with -contrast.heapsnapshot. These dumps will have data about the memory usage for the first two minutes of your application. If you think that the issues are occurring at a later stage you can contact our team to provide you with correct settings so we can observe the period in time that we are interested in. If you want to do that on your own you can play around with these agent properties:

    1. agent.heap_dump.delay_ms - sets an initial delay - from the time the agent starts to the time we start to create heap dumps. The time is in ms and the default here is 10000.

    2. agent.heap_dump.window_ms - sets the window between the creation of two heap dumps in ms. The default is 30000.

    3. agent.heap_dump.count - sets the number of heap dumps we want to create. After this number is reached the feature is disabled. The default is 5.

  5. Keep in mind that depending on your application the heap dump size may vary and sometimes be greater than 1 GB. After creating the necessary heap dumps send them to our team for analysis

Example YAML:

agent:
logger:
level: ERROR
heap_dump:
enable: true

 

Conclusion

The more data we have, the easier and faster our investigation of your issues will be. Even if you are not able to provide our team with all of the aforementioned logs and files, you can send some of them or contact our team to talk through your problems or concerns about gathering and providing this data to us.

Was this article helpful?

0 out of 0 found this helpful

Have more questions? Submit a request