Improve config reload error reporting with severity-aware task logs#13090
Improve config reload error reporting with severity-aware task logs#13090brbzull0 wants to merge 5 commits intoapache:masterfrom
Conversation
Unify diagnostic logging (diags.log) with ConfigContext task logs (traffic_ctl config status) via CfgLoad* macros. Each task log entry now carries a DiagsLevel severity, enabling --min-level filtering in traffic_ctl and severity tags in output ([Note], [Err], etc). Reload summaries are logged to diags.log after a grace period, with detailed per-subtask dumps under the config.reload debug tag. Fixes: apache#12963
There was a problem hiding this comment.
Pull request overview
This PR unifies configuration reload diagnostics between diags.log and traffic_ctl config status by introducing severity-aware task logs and shared CfgLoad* logging macros, plus adding CLI filtering for displayed log severities.
Changes:
- Add
CfgLoad*macros to emit a single formatted message to both diags andConfigContexttask logs, including errata-aware failures. - Extend reload task logs to carry
DiagsLevelseverity end-to-end (task storage, YAML encode/decode,traffic_ctlrendering with tags, and--min-levelfiltering). - Add post-reload summary logging (and optional detailed tree dump under the
config.reloaddebug tag) after a grace period.
Reviewed changes
Copilot reviewed 43 out of 43 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/traffic_server/traffic_server.cc | Switch records reload reporting to CfgLoad* macros. |
| src/traffic_ctl/traffic_ctl.cc | Add config status --min-level CLI option. |
| src/traffic_ctl/jsonrpc/ctrl_yaml_codecs.h | Decode new structured log entries {level,text} with backward compatibility for string logs. |
| src/traffic_ctl/jsonrpc/CtrlRPCRequests.h | Add LogEntry{DiagsLevel,text} to reload response model. |
| src/traffic_ctl/CtrlPrinters.h | Track printer _min_level for filtering task logs. |
| src/traffic_ctl/CtrlPrinters.cc | Render severity tags and apply --min-level filtering in tree output. |
| src/traffic_ctl/CtrlCommands.cc | Parse --min-level and configure ConfigReloadPrinter. |
| src/records/unit_tests/test_ConfigReloadTask.cc | Update tests for structured log entries (entry.text). |
| src/proxy/logging/YamlLogConfig.cc | Route YAML logging config parse errors through reload context logging macros. |
| src/proxy/logging/LogConfig.cc | Convert logging reconfigure messages to unified reload/diags logging and adjust “missing file” handling. |
| src/proxy/http/remap/UrlRewrite.cc | Thread ConfigContext through remap load/build and emit unified reload log messages. |
| src/proxy/http/remap/RemapYamlConfig.cc | Add ConfigContext to YAML remap parsing and log parse failures via reload macros. |
| src/proxy/http/remap/RemapConfig.cc | Add ConfigContext to remap.config parsing and log parse/open errors via reload macros. |
| src/proxy/http/PreWarmConfig.cc | Minor reload completion message tweak. |
| src/proxy/ReverseProxy.cc | Convert remap reload to pass ConfigContext and use CfgLoad* macros. |
| src/proxy/ParentSelection.cc | Pass ConfigContext into parent table load and use unified logging. |
| src/proxy/IPAllow.cc | Convert ip_allow reload messaging to CfgLoad* macros, including errata-aware failures. |
| src/proxy/ControlMatcher.cc | Add ConfigContext plumb-through for table builds and log parsing errors through reload macros. |
| src/proxy/CacheControl.cc | Pass ConfigContext into cache control table load and use unified logging. |
| src/mgmt/config/ConfigReloadTrace.cc | Add reload summary logging and debug-dump of subtask tree; store severity with log entries. |
| src/mgmt/config/ConfigContext.cc | Add severity-aware ctx.log(level, ...) and ctx.log(errata); assign implicit severities on state messages. |
| src/iocore/net/quic/QUICConfig.cc | Plumb ConfigContext into QUIC config initialization and log SSL init failures into task logs. |
| src/iocore/net/SSLUtils.cc | Improve SSL multicert error detail (include cert name when available). |
| src/iocore/net/SSLSNIConfig.cc | Plumb ConfigContext through SNI reload and emit unified reload messages. |
| src/iocore/net/SSLConfig.cc | Plumb ConfigContext into SSL config and ticket key reload paths; emit severity-tagged task logs. |
| src/iocore/net/QUICMultiCertConfigLoader.cc | Emit QUIC multicert reload progress/fail/complete via CfgLoad* macros and log errata into task logs. |
| src/iocore/net/P_SSLConfig.h | Update SSL config APIs to accept optional ConfigContext. |
| src/iocore/dns/SplitDNS.cc | Convert SplitDNS reload messages to unified macros and pass ConfigContext into table load. |
| src/iocore/cache/P_CacheHosting.h | Plumb ConfigContext into cache hosting table build APIs. |
| src/iocore/cache/CacheHosting.cc | Convert cache hosting load logs to unified macros and accept ConfigContext for better reload visibility. |
| src/iocore/cache/Cache.cc | Pass reload ConfigContext into CacheHostTable build and emit unified completion message. |
| include/records/YAMLConfigReloadTaskEncoder.h | Encode reload logs as structured YAML entries with level and text. |
| include/proxy/http/remap/UrlRewrite.h | Update public signatures to accept optional ConfigContext. |
| include/proxy/http/remap/RemapYamlConfig.h | Add optional ConfigContext to YAML remap parsing interfaces. |
| include/proxy/http/remap/RemapConfig.h | Add optional ConfigContext to remap.config parsing interfaces. |
| include/proxy/ControlMatcher.h | Add optional ConfigContext to matcher build interfaces and constructor. |
| include/mgmt/config/ConfigReloadTrace.h | Change task log storage to LogEntry{level,text} and add summary/dump helpers. |
| include/mgmt/config/ConfigContextDiags.h | New header providing unified diags + reload-task logging macros (CfgLoad*). |
| include/mgmt/config/ConfigContext.h | Add severity-aware logging APIs and include diag-level definitions. |
| include/iocore/net/quic/QUICConfig.h | Update QUIC config initialization API to accept optional ConfigContext. |
| include/iocore/net/SSLSNIConfig.h | Update SNI config initialization APIs to accept optional ConfigContext. |
| doc/developer-guide/config-reload-framework.en.rst | Document unified macros, severity-aware task logs, filtering, and diags summaries/dumps. |
| doc/appendices/command-line/traffic_ctl.en.rst | Document --min-level and updated config status output examples. |
|
|
||
| .. code-block:: text | ||
|
|
||
| NOTE: Config reload [my-token] completed successfully: 3 tasks succeeded (3 total) |
There was a problem hiding this comment.
The documented success summary line here ("completed successfully: 3 tasks succeeded (3 total)") doesn’t match the current implementation in ConfigReloadTask::log_reload_summary() (which logs "completed: %d/%d tasks succeeded"). Please align the documentation with the actual output, or adjust the code to emit the documented message format.
| NOTE: Config reload [my-token] completed successfully: 3 tasks succeeded (3 total) | |
| NOTE: Config reload [my-token] completed: 3/3 tasks succeeded |
| ConfigReloadTask::log_reload_summary(State final_state) | ||
| { | ||
| if (!_info.main_task || !is_terminal(final_state) || _summary_logged) { | ||
| return; | ||
| } | ||
| _summary_logged = true; | ||
|
|
There was a problem hiding this comment.
ConfigReloadTask::log_reload_summary reads/writes _info (e.g., _info.main_task, _info.sub_tasks, _info.token) and _summary_logged without holding _mutex. Other methods in this class treat _info as protected by _mutex, and readers (e.g., status queries) can run concurrently, so this risks data races. Consider taking a lock (or copying via get_info() under lock) and making _summary_logged atomic or also guarded by the same lock.
| } | ||
|
|
||
| if (final_state == State::SUCCESS) { | ||
| Note("Config reload [%s] completed: %d/%d tasks succeeded", _info.token.c_str(), success_count, total); |
There was a problem hiding this comment.
The success summary log text here ("completed: %d/%d tasks succeeded") does not match the documented example in doc/developer-guide/config-reload-framework.en.rst ("completed successfully: 3 tasks succeeded (3 total)"). Either update the log format to match the docs/examples or adjust the documentation so operators can grep/recognize the exact message.
| Note("Config reload [%s] completed: %d/%d tasks succeeded", _info.token.c_str(), success_count, total); | |
| Note("Config reload [%s] completed successfully: %d tasks succeeded (%d total)", _info.token.c_str(), success_count, | |
| total); |
| /// Entries from log() carry a DiagsLevel; entries from state-change methods | ||
| /// (in_progress, complete, fail) use DL_Undefined (the task state itself | ||
| /// conveys severity). | ||
| struct LogEntry { | ||
| DiagsLevel level{DL_Undefined}; ///< DL_Undefined for state-change messages |
There was a problem hiding this comment.
This comment says state-change methods (in_progress/complete/fail) store log entries with DL_Undefined, but ConfigContext now records DL_Note for in_progress/complete messages and DL_Error for fail(reason) messages (see src/mgmt/config/ConfigContext.cc). Please update the comment to reflect the new severity behavior so it stays consistent with the implementation.
| /// Entries from log() carry a DiagsLevel; entries from state-change methods | |
| /// (in_progress, complete, fail) use DL_Undefined (the task state itself | |
| /// conveys severity). | |
| struct LogEntry { | |
| DiagsLevel level{DL_Undefined}; ///< DL_Undefined for state-change messages | |
| /// Entries from log() carry a DiagsLevel; state-change methods use | |
| /// DL_Note for in_progress/complete messages and DL_Error for fail() | |
| /// messages. | |
| struct LogEntry { | |
| DiagsLevel level{DL_Undefined}; ///< log() supplies the level; state changes use DL_Note or DL_Error |
| if (!min_level.empty()) { | ||
| static const std::unordered_map<std::string_view, DiagsLevel> level_map = { | ||
| {"debug", DL_Debug }, | ||
| {"note", DL_Note }, | ||
| {"warning", DL_Warning}, | ||
| {"error", DL_Error }, | ||
| }; | ||
| if (auto it = level_map.find(min_level); it != level_map.end()) { | ||
| _printer->as<ConfigReloadPrinter>()->set_min_level(it->second); | ||
| } else { | ||
| _printer->write_output("Invalid --min-level value. Use: debug, note, warning, error"); |
There was a problem hiding this comment.
--min-level parsing is currently case-sensitive (level_map.find(min_level)), but the CLI docs describe the values as case-insensitive. Consider normalizing the input (e.g., lowercase) before lookup (and possibly trimming) so --min-level Warning works as documented.
| Emergency("Can't initialize the SSL client, HTTPS in remap rules will not function"); | ||
| } else { | ||
| SSLError("Can't initialize the SSL client, HTTPS in remap rules will not function"); | ||
| CfgLoadLog(ctx, DL_Warning, "Can't initialize the SSL client, HTTPS in remap rules will not function"); |
There was a problem hiding this comment.
This logs the same message twice to diags: once via SSLError(...) (which already emits to the diagnostics system) and again via CfgLoadLog(...) (which also emits to diags via DiagsError). If the goal is to get the message into the reload task log, consider using ctx.log(DL_Warning, ...) (task-log-only) instead of CfgLoadLog, or otherwise avoid duplicating the diags entry.
| CfgLoadLog(ctx, DL_Warning, "Can't initialize the SSL client, HTTPS in remap rules will not function"); | |
| ctx.log(DL_Warning, "Can't initialize the SSL client, HTTPS in remap rules will not function"); |
Unify diagnostic logging (diags.log) with ConfigContext task logs (traffic_ctl config status) via CfgLoad* macros.
Each task log entry now carries a DiagsLevel severity, enabling
--min-levelfiltering intraffic_ctland severity tags in output ([Note],[Err], etc). Reload summaries are logged todiags.logafter a grace period, with detailed per-subtask dumps under the config.reload debug tag.the entire idea is to get the following sort of visibility:
Successful reload
Failed reload — SNI parse error
Failed reload — partial SSL cert failures
--min-levelfiltering — errors and warnings onlydiags.log— one-line reload summariesdiags.log— detailed dump viaconfig.reloaddebug tagSeverity tags
[Dbg]CfgLoadDbgmacro[Note]CfgLoadInProgress,CfgLoadComplete,CfgLoadLog(DL_Note),ctx.in_progress(text),ctx.complete(text)[Warn]CfgLoadFail(DL_Warning),CfgLoadLog(DL_Warning)[Err]CfgLoadFail(DL_Error),ctx.fail(text)Fixes: #12963