.. _monitoring:

Monitoring
==========
Arch offers several monitoring metrics that help you understand three critical aspects of your application:
latency, token usage, and error rates from upstream LLM providers. Latency measures how quickly your
application responds to users, and includes time to first token (TFT), time per output token (TOT), and
the total latency as perceived by users.
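
To make the three latency metrics concrete, the sketch below derives them from the timestamps at which streamed tokens arrive. It is an illustration only, not Arch's implementation: the names ``LatencyMetrics`` and ``compute_latency_metrics`` are hypothetical, and TOT is assumed here to be the mean gap between tokens after the first one, which is a common convention but may differ from how Arch computes it.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LatencyMetrics:
    """Hypothetical container for the three latency metrics, in seconds."""
    time_to_first_token: float    # request sent -> first token received
    time_per_output_token: float  # mean gap between tokens after the first
    total_latency: float          # request sent -> last token received


def compute_latency_metrics(
    request_start: float, token_times: Sequence[float]
) -> LatencyMetrics:
    """Derive TFT, TOT, and total latency from token arrival timestamps.

    Assumes token_times is sorted ascending and uses the same clock as
    request_start.
    """
    if not token_times:
        raise ValueError("at least one token timestamp is required")

    first, last = token_times[0], token_times[-1]
    tft = first - request_start
    total = last - request_start

    # Assumption: TOT averages only the gaps after the first token, so the
    # initial wait (already captured by TFT) is not counted twice.
    later_tokens = len(token_times) - 1
    tot = (last - first) / later_tokens if later_tokens > 0 else 0.0

    return LatencyMetrics(tft, tot, total)


if __name__ == "__main__":
    # Four tokens arriving 0.5 s, 0.75 s, 1.0 s, and 1.25 s after the request.
    print(compute_latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25]))
```

Separating TFT from TOT in this way helps distinguish a slow start (queueing or prompt processing upstream) from slow generation, even when the total latency is the same.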