Questions about performance testing #372
Maybe, can you try to increase
Thank you for your response! I seem to have discovered some issues: initially, my server was configured for short connections (keepalive_timeout set to 0), and under those conditions Pingora did not perform well. Later, I tested the server with long (keep-alive) connections, and Pingora showed its advantages. I also tested the configuration changes you suggested. The detailed results are as follows.
Nginx test results: CPU around 49%.
Previous Pingora test results: CPU around 80%; in each test, the memory still increases and is never released.
Improved Pingora test results:
In summary, thanks for the response; it has resolved some of my issues. However, the memory growth and other problems still persist. I will continue to monitor this.
Hey! I've been trying to debug ever-increasing memory utilization in our Pingora proxy service (an HTTP proxy with TLS and h2), which is described in this issue and, similarly, in #447, which indicates other Pingora users have the same problem. I can easily reproduce the issue with k6 load tests: at the start of the test, memory utilization increases quickly, and hours after the test it remains high. It keeps growing indefinitely until the service goes OOM or until we restart it. In the image below you can see the load test run for 5 minutes at ~20:00. This is on an AWS ECS Fargate service with 0.5 vCPU and 1 GB of memory.
First I tried to see if we had written any memory leaks in our own code, but if we did, I haven't been able to find them. I've tried using valgrind memcheck with leak detection, as well as valgrind massif for heap profiling.
Then I tried to figure out whether there is some connection pool in Pingora that keeps growing. The service is behind an AWS network load balancer, and its metrics show that the downstream connections are not held open, so I don't believe that is the cause. I tried to disable the upstream connection pool as instructed here: https://github.com/cloudflare/pingora/blob/main/docs/user_guide/pooling.md (see the sketch after the setup code below), but the default size for that pool is 128, so it doesn't make sense that it would grow without bound and be enough to drive the service OOM. And indeed, after disabling it and re-running the test, the memory still grew.
To summarize, I realize this is most likely an error on our end, since I know you run Pingora in production yourselves and I assume you don't have this problem. However, perhaps you have seen this behavior before? Do you have any recommendations for what config I might tweak to resolve it? Any advice is highly appreciated, but I fully understand if you don't have time to help me with this. I'll tag you for visibility @drcaramelsyrup, apologies in advance!
If you have time to take a look, here is our setup code:
pub fn start() {
std::thread::spawn(|| {
// don't drop the rt
let rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(async move {
setup_tracing(tracing_subscriber::registry());
info!("started tracing subscriber with otel exporter in tokio rt");
// keep this runtime running, so that the otel exporter keeps running
std::future::pending::<()>().await;
});
});
let args = Args::parse();
info!(args = ?args, "Starting proxy");
let mut server = Server::new(None).unwrap();
server.bootstrap();
// Attach the Prometheus service
let mut prometheus_service = Service::prometheus_http_service();
let prometheus_address = format!("0.0.0.0:{}", args.metrics_port);
info!("Serving prometheus metrics on address {prometheus_address}");
prometheus_service.add_tcp(&prometheus_address);
server.add_service(prometheus_service);
// XXX: is it fine to just have a runtime like that ?
// It might mess up with the autoreload feature of Pingora, but I don't think we're going
// to use that.
let rt = tokio::runtime::Runtime::new().unwrap();
let aws_config = rt.block_on(
aws_config::defaults(BehaviorVersion::latest())
.timeout_config(
TimeoutConfig::builder()
// Increase the connection timeout
// See https://github.com/awslabs/aws-sdk-rust/issues/871#issuecomment-1690842996
.connect_timeout(Duration::from_secs(10))
.build(),
)
.load(),
);
let conn_opts = PgConnectOptions::new()
.host(&args.db.host)
.port(args.db.port)
.username(&args.db.user)
.password(&args.db.password)
.database(&args.db.name);
let pool = rt
.block_on(sqlx::Pool::connect_with(conn_opts))
.expect("connect to postgres");
let db = Database::new(pool);
// HTTPS server
let tls_resolver = Box::new(TlsResolver::new(
db.clone(),
args.wildcard_fqdn,
Arc::new(ChainAndPrivateKey::new(
args.cert.wildcard_cert_full_chain,
args.cert.wildcard_cert_private_key,
)),
Duration::from_secs(args.certificates_ttl_seconds),
));
let host_resolver = CachingHostResolver::new(
CloudmapHostResolver::new(ServiceDiscoveryClient::new(&aws_config), db),
Duration::from_secs(args.resolver_ttl_seconds),
);
let mut proxy = pingora::prelude::http_proxy_service(
&server.configuration,
EcsProxy::new(args.user_app_port, host_resolver, args.max_rps),
);
let proxy_address = format!("0.0.0.0:{}", args.proxy_port);
info!("Running proxy with TLS on address {proxy_address}");
let mut tls_settings = TlsSettings::with_callbacks(tls_resolver).unwrap();
tls_settings.enable_h2();
proxy.add_tls_with_settings(&proxy_address, None, tls_settings);
server.add_service(proxy);
server.run_forever();
}
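As referenced above, for anyone who wants to try the same pool-disabling experiment, the sketch below shows one way to express it. It is based on my reading of the pooling doc rather than our exact change; the address and SNI are placeholders, and the PeerOptions field names should be double-checked against the Pingora version in use.
use std::time::Duration;

use pingora::prelude::HttpPeer;

// Build an upstream peer whose connection is not kept in the upstream
// keepalive pool after the request finishes (idle_timeout of zero), so every
// request opens a fresh upstream connection. Address and SNI are placeholders.
fn non_pooled_peer() -> Box<HttpPeer> {
    let mut peer = Box::new(HttpPeer::new(
        ("10.0.0.1", 443),
        true,
        "upstream.internal".to_string(),
    ));
    peer.options.idle_timeout = Some(Duration::from_secs(0));
    peer
}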
Try using tikv-jemallocator. It helped me reduce memory usage growth in cases involving a large number of new upstream connections. I think this improvement is related to reduced memory fragmentation.
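For anyone who wants to try this suggestion, the change is small. A minimal sketch, assuming tikv-jemallocator has been added as a dependency in Cargo.toml; the rest of the binary stays the same:
use tikv_jemallocator::Jemalloc;

// Route all heap allocations in this binary through jemalloc instead of the
// system allocator, which tends to fragment less under heavy connection churn.
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;

fn main() {
    // ...proxy setup unchanged...
}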
@SsuiyueL @oddgrd @ermakov-oleg @github2023spring I keep running different benchmarks, and it seems that Pingora does best when it does not use all cores (the "0 threads" config setup). The server is an 8-core amd64 machine running AlmaLinux 9, with disk cache enabled. I use https://loader.io/ for testing, against Pingap, a web server based on Pingora created by @vicanso.
Hi @SsuiyueL, how did you add it? Here is my code:
use async_trait::async_trait;
use pingora::{
prelude::{HttpPeer, Opt},
server::Server,
Error,
};
use pingora_http::ResponseHeader;
use pingora_proxy::{ProxyHttp, Session};
pub struct MyProxy {}
pub struct MyCtx {}
#[async_trait]
impl ProxyHttp for MyProxy {
type CTX = MyCtx;
fn new_ctx(&self) -> Self::CTX {
MyCtx {}
}
async fn request_filter(
&self,
session: &mut Session,
_ctx: &mut Self::CTX,
) -> Result<bool, Box<Error>> {
session.set_keepalive(Some(75));
let header = ResponseHeader::build(200, None).unwrap();
session
.write_response_header(Box::new(header), true)
.await?;
session
.write_response_body(Some(bytes::Bytes::from_static(b"Hello Dakia!")), true)
.await?;
Ok(true)
}
async fn upstream_peer(
&self,
_session: &mut Session,
_ctx: &mut Self::CTX,
) -> Result<Box<HttpPeer>, Box<Error>> {
let addr = ("127.0.0.1", 3000);
let peer = Box::new(HttpPeer::new(addr, false, "one.one.one.one".to_string()));
Ok(peer)
}
}
fn main() {
let mut opt = Opt::parse_args();
opt.conf = Some(
"./pingora.conf.yaml".to_string(),
);
// Read command line arguments
let mut my_server = Server::new(opt).unwrap();
my_server.bootstrap();
let mut my_proxy: pingora::services::listening::Service<pingora_proxy::HttpProxy<MyProxy>> =
pingora_proxy::http_proxy_service(&my_server.configuration, MyProxy {});
my_proxy.add_tcp("0.0.0.0:8080");
my_server.add_service(my_proxy);
println!("Started Server on port 8080");
my_server.run_forever();
}
Config:
threads: 8
daemon: false
upstream_keepalive_pool_size: 1000
Hello,
I encountered some issues while conducting performance testing. I reviewed previous issues, but they did not resolve my problem. Could you please help me with a detailed explanation? I would greatly appreciate it.
I have implemented a simple HTTP proxy using Nginx (OpenResty and Nginx-Rust) and Pingora. Below is the code I have implemented based on the example [modify_response]:
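Roughly, the proxy looks like this (a simplified sketch rather than the exact code; the listen port, upstream address, and the response-header tweak are illustrative):
use async_trait::async_trait;
use pingora::{prelude::HttpPeer, server::Server, Error};
use pingora_http::ResponseHeader;
use pingora_proxy::{ProxyHttp, Session};

pub struct ModifyProxy;

#[async_trait]
impl ProxyHttp for ModifyProxy {
    type CTX = ();
    fn new_ctx(&self) -> Self::CTX {}

    // Forward every request to a single upstream (address is illustrative).
    async fn upstream_peer(
        &self,
        _session: &mut Session,
        _ctx: &mut Self::CTX,
    ) -> Result<Box<HttpPeer>, Box<Error>> {
        Ok(Box::new(HttpPeer::new(("127.0.0.1", 8080), false, "".to_string())))
    }

    // Adjust the upstream response header before it is sent downstream,
    // in the spirit of the modify_response example.
    async fn response_filter(
        &self,
        _session: &mut Session,
        upstream_response: &mut ResponseHeader,
        _ctx: &mut Self::CTX,
    ) -> Result<(), Box<Error>>
    where
        Self::CTX: Send + Sync,
    {
        upstream_response.insert_header("X-Proxy", "pingora")?;
        Ok(())
    }
}

fn main() {
    let mut server = Server::new(None).unwrap();
    server.bootstrap();
    // Listen on the port used in the wrk/ab commands below (illustrative).
    let mut proxy = pingora_proxy::http_proxy_service(&server.configuration, ModifyProxy);
    proxy.add_tcp("0.0.0.0:6191");
    server.add_service(proxy);
    server.run_forever();
}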
config:
My testing was conducted on an Ubuntu system with 8 cores and 16 GB of memory. Nginx started 8 worker processes.
1. Using wrk for testing:
wrk -t10 -c1000 -d30s http://172.24.1.2:6191
The result of Nginx:
The total CPU usage is around 50%, and the memory usage of each worker can be ignored.
The result of Pingora:
The total CPU usage is around 70%, and the memory usage increases by 0.3% after each test (0->0.9->1.2).
Q1: In terms of throughput, Nginx performs slightly better than Pingora, while Pingora shows slightly lower latency than Nginx. (Isn't that a bit strange?) Overall, the conclusion is that the differences between the two are not significant. Does this align with your expectations?
Q2: In terms of CPU usage, the overhead of Pingora is significantly greater than that of Nginx. Is this in line with your expectations? Regarding memory, I’ve noticed that memory usage increases after each test and does not recover. Could this indicate a memory leak?
2. Using ab for testing:
ab -n 10000 -c 100 http://172.24.1.2:6191/
When I perform testing with ab, Pingora times out:
The packet capture analysis is as follows:
It can be seen that a GET request was sent at the beginning, but Pingora did not return a response.
Nginx can be tested normally using the same command, and the packet capture shows that it responded properly.
ab uses HTTP/1.0, but I verified that this is not the cause of the problem.
Additionally, I also used Siege for testing, and the results were similar to those obtained with wrk.
3. Summary
Pingora is a remarkable project, and I’m very interested in its potential improvements over Nginx. However, I would like to know:
Am I missing any configurations, or how can I improve it to enhance performance and reduce CPU and memory usage?
Is it unfair to compare Pingora with Nginx in this simple scenario? In other words, is Pingora's advantage more apparent in more complex scenarios? (If so, I will use Pingora in more complex scenarios.)
I really appreciate your support.