
TAMI Site Reliability Engineering

This page details our efforts to keep the systems online and reliable.


Currently, a Raspberry Pi in the space displays a real-time status page from UptimeRobot, which tracks the status of our services.

Troubleshooting steps

The first step is to determine whether the root cause lies in the network or in the infrastructure. Use the UptimeRobot site (https://stats.uptimerobot.com/Jx4JQiBDEZ) to see what is up and what is down.

  • If everything is down (including TAMI's IP, 82.80.54.64), it is definitely a network issue, and possibly an infra issue as well.
  • If TAMI's IP address is reachable (ping and telnet) but the YunoHost services are down, it is likely just an infra issue.
  • If things are still not working after trying all the steps on this page, reach out to someone from the contact page. Note that email, Matrix, and XMPP are all hosted inside TAMI, so you cannot communicate on those channels during an outage. Reach out via Telegram and post in the main channel.
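The triage rules above can be sketched as a small shell helper. This is a minimal sketch, not an official TAMI script: the function names `classify` and `triage` are hypothetical, it uses `curl` in place of the manual telnet check, and it assumes the website is served over HTTPS at telavivmakers.space (the hostname used for SSH further down this page).

```shell
# Triage sketch: is the problem in the network, or in the infra behind it?
TAMI_IP="82.80.54.64"            # TAMI's IP, as listed on this page
TAMI_HOST="telavivmakers.space"  # assumption: the website answers on this hostname

# classify <ip_status> <web_status>
# Each argument is an exit status: 0 means the check succeeded.
classify() {
    if [ "$1" -ne 0 ]; then
        echo "network"   # the IP itself is unreachable (infra may be down too)
    elif [ "$2" -ne 0 ]; then
        echo "infra"     # the IP answers, but the web service does not
    else
        echo "ok"
    fi
}

# triage: run the live checks and print the classification.
triage() {
    ping -c 3 "$TAMI_IP" >/dev/null 2>&1
    ip_status=$?
    curl -fsS -o /dev/null --max-time 10 "https://$TAMI_HOST/" 2>/dev/null
    web_status=$?
    classify "$ip_status" "$web_status"
}
```

After sourcing the file, run `triage`; it prints `network`, `infra`, or `ok`. An `ok` result only means the front page answered, so individual services may still need to be checked as described below.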

Network

Relevant Link: Physical Infra

Infra

Relevant Link: Physical Infra

If there is an issue with a single service

  • The first step is to see whether you can log into the YunoHost admin panel.
  • Then check the service at Tools > Services.
  • Review the logs, restart the service if necessary, and consider sharing the logs via yunopaste in a relevant group on TAMI's communication channels.

If there is an issue with multiple services

  • Attempt the steps above for each service. If all services are affected, the problem might be related to YunoHost itself or to the device it is running on.
  • Try to SSH into YunoHost. The password is your YunoHost SSO password.
    • ssh <yunohost username>@telavivmakers.space
  • Check the output of the following commands:
    • sudo systemctl status nginx.service (for website issues)
    • sudo systemctl status mautrix_telegram.service (for telegram bridge issues)
  • If the output shows errors, or the service is otherwise broken, try restarting it:
    • sudo systemctl restart <relevant service>
  • Failing this, look for more logs. Look up any error messages and follow the rabbit holes.
    • sudo journalctl -u <relevant service>
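When several services are affected, the status and log checks above can be run in one pass once you are logged in over SSH. The following is a rough sketch under stated assumptions, not an established TAMI tool: the function name `check_services` and the `SYSTEMCTL` / `JOURNALCTL` override variables are hypothetical, and the unit list contains only the two services named on this page.

```shell
# Units named on this page; extend the list as needed.
SERVICES="nginx.service mautrix_telegram.service"

# check_services: print the state of each unit, and the last 20 log lines
# for any unit that is not active.
# SYSTEMCTL and JOURNALCTL are hypothetical override hooks (useful for dry runs);
# by default the real systemctl and "sudo journalctl" are used.
check_services() {
    for unit in $SERVICES; do
        if ${SYSTEMCTL:-systemctl} is-active --quiet "$unit"; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
            ${JOURNALCTL:-sudo journalctl} -u "$unit" -n 20 --no-pager
        fi
    done
}
```

Source the file and run `check_services`. The sketch deliberately does not restart anything; once the logs have been read, restart a broken unit by hand with `sudo systemctl restart <relevant service>`.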
tamiwiki/internal/networks/tami_sre.1677968259.txt.gz · Last modified: 2023/03/05 00:17 by 444b