
# Handbook On-Call

## Introduction <a href="#introduction" id="introduction"></a>

GitLab recognizes that the Handbook is a critical part of empowering team members to do their jobs effectively. We have therefore implemented a basic on-call process (refer to [First-response Service Level Objective](/product-management/untitled-4/about/handbook-on-call.md#first-response-service-level-objective) below) to ensure that someone is available to help when something in the handbook is broken or when team members have trouble updating it.

## Reporting an issue <a href="#reporting-an-issue" id="reporting-an-issue"></a>

Any issues should be reported in the [#handbook-escalation](https://gitlab.slack.com/archives/CVDP3HG5V) channel in Slack.

If you do not get a response within the indicated [first-response SLO](/product-management/untitled-4/about/handbook-on-call.md#first-response-service-level-objective), feel free to DM the Editor team Engineering Manager or Product Manager (refer to the [team page](https://about.gitlab.com/handbook/engineering/development/dev/create-editor/)).

### When to escalate an issue <a href="#when-to-escalate-an-issue" id="when-to-escalate-an-issue"></a>

Issues should only be escalated to the Handbook On-Call team if they relate to:

1. Master being broken
2. Security incidents
3. Significant broken pages in production (e.g. the values page being unreachable)
4. Broken infrastructure
5. Bugs that prevent team members from accessing important information
6. Time-sensitive updates to the Handbook where there is an issue making the update

## On-call schedule <a href="#on-call-schedule" id="on-call-schedule"></a>

Until recently, members of the `Static Site Editor` team were part of the on-call process and members of the [#handbook-escalation](https://gitlab.slack.com/archives/CVDP3HG5V) channel. Additionally, any GitLab team member can volunteer to join the [#handbook-escalation](https://gitlab.slack.com/archives/CVDP3HG5V) channel and help out.

We are looking into alternatives for the future.

### Expectations for being on-call <a href="#expectations-for-being-on-call" id="expectations-for-being-on-call"></a>

1. Make sure you are set to receive notifications for the [#handbook-escalation](https://gitlab.slack.com/archives/CVDP3HG5V) channel
2. When an issue is reported:
   1. Acknowledge the team member and let them know you are looking into it
   2. You can check `#production`, `#incident-management`, and `#is-this-known` to see if it's a known issue with infrastructure or another problem.
   3. Provide an update as soon as you are able to confirm their problem.
   4. You can also post updates in `#website` and/or `#handbook` as appropriate.
   5. Resolve the problem, or provide feedback to the team member on how they can resolve it.
   6. Offer to have a Zoom call to help replicate or resolve the issue if it is not straightforward.

### When to hand over to Reliability Engineering <a href="#when-to-hand-over-to-reliability-engineering" id="when-to-hand-over-to-reliability-engineering"></a>

The Handbook On-Call deals specifically with matters relating to the `www-gitlab-com` repo source code and configuration. If a reported issue relates to the GitLab product or the infrastructure running the [https://about.gitlab.com](https://about.gitlab.com/) website, it should be escalated to the Reliability Engineering team. To report an incident, follow the instructions on the Incident Management page: [/handbook/engineering/infrastructure/incident-management/#reporting-an-incident](https://about.gitlab.com/handbook/engineering/infrastructure/incident-management/#reporting-an-incident).

### First-response Service Level Objective <a href="#first-response-service-level-objective" id="first-response-service-level-objective"></a>

All incidents reported in the [#handbook-escalation](https://gitlab.slack.com/archives/CVDP3HG5V) channel during weekdays (Mon - Fri, 08:00 UTC+0 - 18:00 UTC-7) should receive an initial acknowledgement within 1 hour of being reported.
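
As a rough illustration of the window above, the following sketch checks whether a report falls inside the covered hours and, if so, when the acknowledgement is due. It is not part of any existing tooling. The function names are hypothetical, and it rests on one interpretation that the text does not spell out: each Monday-to-Friday window opens at 08:00 UTC and runs for 17 hours, because 18:00 UTC-7 is 01:00 UTC the following day (so Friday's window ends at 01:00 UTC on Saturday).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: each weekday window opens at 08:00 UTC and closes at
# 18:00 UTC-7, which is 01:00 UTC on the following day (17 hours later).
WINDOW_START_HOUR_UTC = 8
WINDOW_LENGTH = timedelta(hours=17)
ACK_TARGET = timedelta(hours=1)


def in_slo_window(reported_at: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls inside a weekday window."""
    ts = reported_at.astimezone(timezone.utc)
    # A window can spill past midnight UTC, so check the window that opened
    # today and the one that opened yesterday.
    for days_back in (0, 1):
        day = (ts - timedelta(days=days_back)).date()
        start = datetime(day.year, day.month, day.day,
                         WINDOW_START_HOUR_UTC, tzinfo=timezone.utc)
        # weekday() < 5 means the window opened on Monday through Friday.
        if start.weekday() < 5 and start <= ts < start + WINDOW_LENGTH:
            return True
    return False


def ack_deadline(reported_at: datetime) -> Optional[datetime]:
    """Return the acknowledgement deadline, or None outside the window."""
    if not in_slo_window(reported_at):
        return None
    return reported_at + ACK_TARGET
```

Pass timezone-aware datetimes only; a naive datetime would be interpreted in the local timezone of the machine running the code.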

## Common Incidents and Tips <a href="#common-incidents-and-tips" id="common-incidents-and-tips"></a>

### Runbook for about.gitlab.com <a href="#runbook-for-aboutgitlabcom" id="runbook-for-aboutgitlabcom"></a>

There is also a [runbook for about.gitlab.com incident handling](https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/uncategorized/about-gitlab-com.md).

### Managing broken master alerts in #handbook-escalation <a href="#managing-broken-master-alerts-in-handbook-escalation" id="managing-broken-master-alerts-in-handbook-escalation"></a>

All broken CI pipelines for the `master` branch of the `www-gitlab-com` repo are automatically posted in the Slack channel. These reports should be investigated and addressed where needed.

Once a report has been looked at, please leave a comment stating the nature of the problem and the action taken, and add a ✅ reaction to the message to show that it has been handled.

If for some reason a large number of failures is spamming the channel, the error reporting can be turned off in the repo settings: <https://gitlab.com/gitlab-com/www-gitlab-com/-/services/slack/edit>

### Merging urgent MRs <a href="#merging-urgent-mrs" id="merging-urgent-mrs"></a>

See [the description of this issue](https://gitlab.com/gitlab-com/www-gitlab-com/-/issues/6356) for details on the current workarounds required for [this bug related to the Merge Train](https://gitlab.com/gitlab-org/gitlab/-/issues/214742#note_338664758).

### Stuck Merge Train <a href="#stuck-merge-train" id="stuck-merge-train"></a>

To see the status of the merge train (useful when team members are reporting that their MRs seem 'stuck' on the train), see [this issue to check the status and perform a workaround, if necessary](https://gitlab.com/gitlab-org/gitlab/-/issues/217908#when-the-merge-train-in-the-www-gitlab-com-project-might-be-stuck).

TL;DR for workaround: If the first/oldest MR `iid` in [the FIFO list](https://gitlab.com/api/v4/projects/7764/merge_trains?scope=active&per_page=100&sort=asc) (`sort=asc` by ID) is actively running a pipeline and eventually gets merged, then things are moving along, just slowly. If the first one in the list isn't currently running any pipeline, remove it from the train and re-add it (it should go to the end).
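
The check described above can be sketched in Python as follows. This is an illustrative helper, not an existing tool. The function names are hypothetical. It assumes each entry in the API response carries a `merge_request` object with an `iid` and a `pipeline` object with a `status`. Which statuses count as "actively running" is also an assumption of this sketch.

```python
import json
import urllib.request

# The FIFO list endpoint linked above, oldest train entry first.
MERGE_TRAINS_URL = (
    "https://gitlab.com/api/v4/projects/7764/merge_trains"
    "?scope=active&per_page=100&sort=asc"
)

# Assumption: these pipeline statuses mean the pipeline is still in progress.
ACTIVE_PIPELINE_STATUSES = {"created", "pending", "running"}


def fetch_active_train(url=MERGE_TRAINS_URL, token=None):
    """Fetch the active merge train entries as a list of dicts."""
    request = urllib.request.Request(url)
    if token:
        # Personal access token header used by the GitLab REST API.
        request.add_header("PRIVATE-TOKEN", token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def first_stuck_mr_iid(train):
    """Return the iid of the first MR if it has no active pipeline, else None.

    `train` is the parsed list from the endpoint above, sorted ascending.
    """
    if not train:
        return None  # Nothing is on the train, so nothing is stuck.
    first = train[0]
    pipeline = first.get("pipeline") or {}
    if pipeline.get("status") in ACTIVE_PIPELINE_STATUSES:
        return None  # The head of the train is moving, just possibly slowly.
    return first["merge_request"]["iid"]
```

If `first_stuck_mr_iid(fetch_active_train())` returns an `iid`, that MR is the candidate to remove from the train and re-add, as described in the workaround.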

