
Ingesters OOM killed when average trace size grows despite rate limiting and discarding live traces #4424

Open

adhinneupane opened this issue Dec 6, 2024 · 1 comment

@adhinneupane
Describe the bug
Despite setting low rate limits and using max_traces_per_user, our ingesters get OOM killed when the average trace size grows above 100 KiB.

To Reproduce
Steps to reproduce the behavior:

  1. Start Tempo (2.5.0) in a k8s cluster with 3 ingesters at a 10 GiB memory limit each.
  2. Start xk6-client-tracing with the average trace size set to 100 KiB (see param.js below; a build/run sketch follows these steps).
  3. Run the load test with ~3k to 5k active live traces.
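For reference, a rough sketch of how the load generator is built and run, following the standard xk6 flow; the endpoint below is a placeholder for whatever OTLP receiver is exposed:

# Build a k6 binary that includes the xk6-client-tracing extension.
xk6 build --with github.com/grafana/xk6-client-tracing@latest

# Run the test, pointing the script's ENDPOINT env var at the OTLP endpoint.
./k6 run -e ENDPOINT=https://<your-otlp-endpoint>:443 param.js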

Expected behavior
Ingesters get OOM killed and restart.

Environment:

  • Infrastructure: [Kubernetes]
  • Deployment tool: [tanka]
  • Tempo version: 2.5.0
  • Number of distributors to ingesters: 3 : 3

Additional Context

We do not see this problem when the average trace size (p95) is below 50 KiB. Whenever the average trace size exceeds ~90 KiB, we cannot prevent OOM kills despite setting a low burst_size_bytes, rate_limit_bytes, and max_traces_per_user.

| OOM Kills | burst_size_bytes | rate_limit_bytes | Average Trace Size (Bytes) | Live Traces (30k) | Distributor bytes limit (burst + rate) | Distributor (N) x Ingester (N) | Ingester Memory (Max) | Rate Limit Strategy | Time Under Test | Average Trace Size × Live Traces (MiB) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17 MiB | 14 MiB | 57000 | 15000 | 29 MiB | 3 x 3 | 80% | Global | 25m | 815.3915405 |
| 0 | 17 MiB | 14 MiB | 48000 | 18000 | 29 MiB | 3 x 3 | 70% | Global | 25m | 823.9746094 |
| 0 | 17 MiB | 14 MiB | 38000 | 25000 | 28 MiB | 3 x 3 | 60% | Global | 25m | 905.9906006 |
| 1 | 17 MiB | 14 MiB | 187000 | 2000 | 18 MiB | 3 x 3 | N/A | Global | < 10m | 356.6741943 |
| 1 | 17 MiB | 14 MiB | 219000 | 1200 | 18.9 MiB | 3 x 3 | N/A | Global | < 10m | 250.6256104 |
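For reference, a minimal sketch of the per-tenant limits referred to above, expressed as Tempo's flat overrides block; the field names follow Tempo's documented override options and the values are illustrative, mirroring the 17 MiB burst / 14 MiB rate rows in the table:

# Illustrative per-tenant limits (flat overrides format); values mirror the table above.
overrides:
  ingestion_rate_strategy: global        # "Global" rate limit strategy
  ingestion_rate_limit_bytes: 14680064   # 14 MiB
  ingestion_burst_size_bytes: 17825792   # 17 MiB
  max_traces_per_user: 30000             # cap on live traces per tenant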

param.js

import { sleep } from 'k6';
import tracing from 'k6/x/tracing';

export const options = {
    vus: 120,
    stages: [
    { duration: '2m', target: 120 },
    { duration: '10s', target: 120 },
    { duration: '2m', target: 120 },
    { duration: '10s', target: 120 },
    { duration: '2m', target: 120 },
    { duration: '10s', target: 120 },
    { duration: '2m', target: 120 },
    { duration: '10s', target: 120 },
    { duration: '2m', target: 120 },
    ]
};

const endpoint = __ENV.ENDPOINT || "https://<>:443"
const client = new tracing.Client({
    endpoint,
    exporter: tracing.EXPORTER_OTLP,
    tls: {
      insecure: true,
    }
});

export default function () {
    let pushSizeTraces = 50;   // traces per push
    let pushSizeSpans = 0;
    let t = [];
    for (let i = 0; i < pushSizeTraces; i++) {
        let c = 100;           // spans per trace
        pushSizeSpans += c;
        t.push({
            random_service_name: false,
            spans: {
                count: c,
                size: 900,     // ~900 B per span -> ~90 KB per trace
                random_name: true,
                fixed_attrs: {
                    "test": "test",
                },
            }
        });
    }

    let gen = new tracing.ParameterizedGenerator(t);
    let traces = gen.traces();
    sleep(5);
    console.log(traces);
    client.push(traces);
}

export function teardown() {
    client.shutdown();
}
@joe-elliott
Member

joe-elliott commented Dec 10, 2024

There are two things that drive memory usage in Tempo ingesters, compactors, and (depending on the query) queriers:

  1. Trace size
  2. Dictionary sizes in parquet

I'm not surprised you're seeing elevated memory usage as you bring up the trace size, but I am very surprised you are seeing such elevated usage at just ~100-200 KB. We run cells with tenants who push traces that are 50 MB+.

Some things to test:

  1. This is likely creating a very large dictionary, which is probably part of the memory issue. Let's try removing it (see the sketch after this list):

random_name: true

  2. Tempo 2.7 will have some nice ingester memory improvements and will also contain the metric tempo_ingester_live_trace_bytes, which will help you see, per tenant, who is consuming live trace memory.

  3. Another issue that we are looking at now is that an ingester that is CPU starved will experience lock contention, and the Go heap will balloon. This is harder to prove out, but it should stay in the back of our minds while we are diagnosing this. A memory profile would be helpful for seeing whether this is the issue. Honestly, a memory profile would be great all around and would help me very quickly diagnose the issue if you could provide one.

  4. The metric below shows roughly what Tempo thinks the bytes per trace is and would be useful to confirm what we believe the test is creating. We can show this metric per pod or per tenant to see if there's anything interesting.

sum(rate(tempo_ingester_bytes_received_total{}[1m])) / 
sum(rate(tempo_ingester_traces_created_total{}[1m]))

  5. Tempo has the ability to restrict max trace size, but the values are so low in your tests I don't think that's useful here.
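To make the first suggestion concrete, here is a sketch of the relevant span block from param.js with random span names turned off; everything else stays as in the original script:

// param.js fragment: same span shape as before, but with random_name disabled,
// since randomized span names likely inflate the parquet dictionaries.
t.push({
    random_service_name: false,
    spans: {
        count: c,
        size: 900,
        random_name: false,   // was true
        fixed_attrs: {
            "test": "test",
        },
    }
});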

Thanks for the detailed test and write-up. Hopefully we will be able to get to the bottom of these issues.
