mirror of
https://github.com/katanemo/plano.git
synced 2026-04-29 19:06:34 +02:00
* updated docs (again) * updated the LLMs section, prompt processing section and the RAG section of the docs --------- Co-authored-by: Salman Paracha <salmanparacha@MacBook-Pro-261.local>
7 lines
No EOL
414 B
ReStructuredText
7 lines
No EOL
414 B
ReStructuredText
Monitoring
|
|
==========
|
|
|
|
Arch offers several monitoring metrics that help you understand three critical aspects of your application:
|
|
latency, token usage, and error rates by an upstream LLM provider. Latency measures the speed at which your
|
|
application is responding to users, which includes metrics like time to first token (TFT), time per output
|
|
token (TOT) metrics, and the total latency as perceived by users. |