What happens when webhook ongoing verification times out or gets a non-200 response

sgolux ✭✭
edited 04/11/23 in API & Developers

Hello -

Your docs state that after every 100 callbacks are delivered to a webhook callback URL, there will be an ongoing verification request, similar to the original verification request. If that request happens to land when the server at the callback URL is down (which would lead to a timeout), or is in some kind of maintenance mode where the route is not available, does the ongoing verification request also follow an exponential backoff/retry strategy like event callbacks do, or is the webhook disabled immediately? Or is it something else?

Answers

  • Genevieve P. Employee Admin
    Answer ✓

    Hey @sgolux

    If the callback URL is down or does not respond to the callback validation request, then the webhook is disabled immediately. The webhook can be re-enabled with a webhook update. Here's the section in the documentation that talks about webhooks.

    Cheers!

    Genevieve
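
For readers following along, here is a minimal sketch of the "webhook update" mentioned above. It assumes the update is a `PUT` to `/webhooks/{webhookId}` on the Smartsheet 2.0 API with a body of `{"enabled": true}`; the base URL, the helper names, and the token handling are illustrative assumptions, so check them against the webhook documentation before relying on them.

```python
import json
import urllib.request

# Assumed base URL for the Smartsheet API; adjust if your region differs.
API_BASE = "https://api.smartsheet.com/2.0"


def build_enable_request(webhook_id, token):
    """Build (but do not send) the request that sets enabled=true on a webhook."""
    body = json.dumps({"enabled": True}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/webhooks/{webhook_id}",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def enable_webhook(webhook_id, token):
    """Send the update and return the decoded JSON response."""
    request = build_enable_request(webhook_id, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Re-enabling presumably triggers a fresh verification request, so the callback URL should be reachable again before `enable_webhook` is called.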

  • sgolux ✭✭

    Hi @Genevieve P., thanks for the response. This is disappointing, because we do occasionally need to take our site down for maintenance. We anticipate having hundreds of thousands of webhooks, with no way to know which of them were disabled by a validation request during the maintenance interval. That means that on restart we will need to iterate through them all, and either simply force re-enable them all, or ask for their status and re-enable any that are disabled. This will be a bit of a drag on our system, but also on yours, so that is sad. Unless you have any further suggestions...

    Also curious whether it makes a difference in load on your side to issue (say) 100,000 "enable webhook" API calls, or to issue 100,000 "get status of webhook" API calls, and then only enable the few that are disabled at that point.

  • Genevieve P. Employee Admin

    Hi @sgolux

    If you're centralizing ownership to a "service account", then you would be able to very quickly take an accounting of all disabled webhooks with minimal load on our server by using List Webhooks. This would get a list of all webhooks (and their statuses) owned by the user who made the API call.

    However, you would need to do that for each distinct webhook owner, and with 100,000 webhooks you would also want to use pagination to return 10,000 at a time.

    I've seen it be standard practice to have a daily task that does exactly this: list all webhooks, re-enable anything that's been disabled that shouldn't be, and create/recreate anything missing.

    I hope that helps!

    Genevieve
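
A rough sketch of the reconciliation task described above: page through List Webhooks and collect the IDs that need re-enabling. The `page`/`pageSize` query parameters, the `data`/`totalPages` response fields, and the status string `DISABLED_VERIFICATION_FAILED` are assumptions taken from memory of the API reference, and the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumed base URL for the Smartsheet API.
API_BASE = "https://api.smartsheet.com/2.0"

# Assumed status value for webhooks disabled by a failed verification;
# confirm the exact strings in the webhook documentation.
REENABLE_STATUSES = frozenset({"DISABLED_VERIFICATION_FAILED"})


def fetch_page(token, page, page_size):
    """Fetch one page of webhooks owned by the user behind `token`."""
    url = f"{API_BASE}/webhooks?page={page}&pageSize={page_size}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_webhooks(fetch, page_size=10000):
    """Yield every webhook, calling fetch(page, page_size) until the last page."""
    page = 1
    while True:
        result = fetch(page, page_size)
        yield from result.get("data", [])
        if page >= result.get("totalPages", 1):
            break
        page += 1


def find_disabled(fetch, statuses=REENABLE_STATUSES):
    """Return the IDs of webhooks whose status is in `statuses`."""
    return [w["id"] for w in iter_webhooks(fetch) if w.get("status") in statuses]
```

For one owner this could be called as `find_disabled(lambda page, size: fetch_page(token, page, size))`, and repeated per owner token. At 10,000 per page, 100,000 webhooks under a single owner would take 10 list calls, after which only the returned IDs need an update call.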

  • sgolux ✭✭

    Cool, thanks. We won't have ownership centralized, as we will be working in each of our (mutual) customers' Smartsheet instances separately, but I can iterate per customer. It will just be a bit messy...

  • sgolux ✭✭

    As a suggestion, I might offer that your ongoing verification callbacks could at least be retried 3 or 4 times before you disable the webhook if the response is a timeout. That way you would have fewer unnecessary disables at times of transient network instability, which could be anywhere along the path.
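
To make the suggestion concrete, here is a purely hypothetical sketch of what "retry on timeout before disabling" could look like. It does not describe how Smartsheet behaves today; `send_verification`, the attempt count, and the delays are all invented for illustration.

```python
import time


def verify_with_retries(send_verification, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Return True if verification succeeds, False if the webhook should be disabled.

    `send_verification` is a hypothetical callable that returns True/False for a
    definitive response and raises TimeoutError when the callback URL times out.
    Only timeouts are retried; a definitive failure is returned immediately.
    """
    for attempt in range(attempts):
        try:
            return bool(send_verification())
        except TimeoutError:
            # Transient failure: back off exponentially before the next attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return False
```

With the defaults, a callback URL that keeps timing out would be tried four times, with waits of 1, 2, and 4 seconds in between, before the webhook is disabled.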