hscloud

VMs are coming any day now ~q3k, A.D. 2018

Our new internal highly-available Infrastructure/Platform-as-a-Service.

This runs in our datacenter (dcr01 on netbox). This is different from our ISP services or internal machines.

Components

Currently hscloud is made up of two Kubernetes clusters named:

  • k0.hswaw.net - runs on the following machines: dcr01s22, dcr01s24, dcr03s16. It has 144 x86 cores and 368 GB of RAM.
  • k1.hswaw.net (new, wip, more details later) - runs on the following machines: dcr03s19n01, dcr03s19n02, dcr03s19n03. It has 144 x86 cores / 288 threads (EPYC 7451) and 386 GB of RAM.

We also have half a PB of storage on old SAS drives (most of it currently cold), accessible via Ceph (radosgw or Kubernetes PersistentVolumes) across two Ceph clusters: waw3 (old, main, currently ~120TB) and waw4 (new, wip, currently ~60TB).
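As an illustration of the PersistentVolume route, here is a minimal sketch that builds a standard Kubernetes PersistentVolumeClaim manifest and prints it as JSON. The claim name, namespace, size and storage class name below are hypothetical placeholders, not actual hscloud values; check the hackdoc documentation for the storage classes that really exist on each cluster.

```python
import json


def make_pvc(name: str, namespace: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            "storageClassName": storage_class,
        },
    }


# All four arguments are made-up example values (assumptions).
manifest = make_pvc(
    name="example-data",
    namespace="personal-example",
    size="10Gi",
    storage_class="example-storage-class",
)
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be piped into `kubectl apply -f -` once the placeholder values are replaced with real ones.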

Tenants

Most services have been migrated from our old machines into Kubernetes. Exceptions are listed below.

You are also free to host your own personal stuff there within reason. See below for access.

Boston Evacuation Aktion

Here's a list of services that currently live on Boston Packets but that we'd like to migrate to hscloud. Ask on #infra how to contribute.

  • https://owncloud.hackerspace.pl - work in progress.
  • mailman (https://lists.hackerspace.pl) - the web service is already on k8s, but also proxied via boston. Mailman-core and the database (used by both web and core, postgres) are still on boston. (2025-01: on implr's todo list)
  • ldap/kerberos - (hard) (2025-01: infowski has a WIP ldap replica)
  • email services (exim, dovecot) - (hard)
  • mun (irc bot)
  • `~user` dir serving
  • sensitivefilter.py

k0 Evacuation Aktion

List of services (listed by their kube namespace names) to (eventually) move to k1:

## Ready to migrate
     
personal-*
nextcloud                   [ ready to prepare ]
mailman-hackerspace-prod    
mastodon-hackerspace-{prod,qa}
codehosting-prod            [ prepared, copy data and switch over when ready ]
gerrit                      
matrix{,-0x3c}              
kasownik                    [ prepared, copy data and switch over when ready ]
monitoring-global

## Already migrated

blog                        
depotview                   
gallery                     
hackdoc                     
home                        
oodviewer-prod              
teleimg                     
personal-radex
onlyoffice-prod             
labelmaker                  
printservant                
zhp-site  
internet                    
speedtest                   
walne     
ood                 
sourcegraph                  
inventory                   
wiki                        
paperless                   
redmine                     
site                        
capacifier                  
ldapweb                     
sso                         
registry                    
roundcube
asterisk                   
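Some entries in the list above use shell-style brace notation to name several namespaces at once (for example `matrix{,-0x3c}`). As a rough sketch of how that shorthand expands, the helper below (`expand_braces`, a name invented for this example) handles simple, non-nested groups. It does not treat glob entries such as `personal-*` specially; those pass through unchanged.

```python
import re
from itertools import product


def expand_braces(pattern: str) -> list[str]:
    """Expand shell-style {a,b} groups (no nesting) into all combinations."""
    # re.split with a capturing group alternates literal text (even indices)
    # and brace group bodies (odd indices).
    parts = re.split(r"\{([^{}]*)\}", pattern)
    choices = [
        part.split(",") if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in product(*choices)]


for entry in ["mastodon-hackerspace-{prod,qa}", "matrix{,-0x3c}", "gerrit"]:
    print(entry, "->", expand_braces(entry))
```

So `matrix{,-0x3c}` stands for the two namespaces `matrix` and `matrix-0x3c`, and `mastodon-hackerspace-{prod,qa}` for `mastodon-hackerspace-prod` and `mastodon-hackerspace-qa`.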

List of unfinished business before we can make k1 production-grade and retire k0:

  • Storage: Migrate all ceph rbd data
  • Storage: Migrate ceph rgw
  • Storage: Add benji
  • Implement PSP equivalent
  • Migrate remaining services
  • Add global/default NetworkPolicies

Also todo to improve k1 reliability:

  • more reliable LoadBalancer routing (fuckin' calico maaan)

Monitoring

Documentation, Getting Access and Usage

Self-documenting in hackdoc (hscloud documentation stored within hscloud): https://hackdoc.hackerspace.pl/doc/codelabs/index.md

Resources/Services

dcr03s19

This machine is a Supermicro A+ Server 2123BT-HTR, made up of 4 H11DST-B nodes. Each has 2x 24-core 48-thread AMD EPYC CPUs.

Nodes a, b, c (dcr03s19n{01,02,03}) make up the k1 cluster. The fourth node is Snowflake, a VM machine (hosting, among other things, Boston Packets).
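The k1 core and thread totals quoted under Components follow directly from this hardware description. A quick arithmetic check, using only the figures stated on this page (the constant names are made up for the example):

```python
NODES_IN_K1 = 3        # dcr03s19n01, n02, n03 (the fourth node is Snowflake)
CPUS_PER_NODE = 2      # 2x AMD EPYC per node
CORES_PER_CPU = 24     # 24-core, 48-thread CPUs
THREADS_PER_CORE = 2   # 48 threads / 24 cores

k1_cores = NODES_IN_K1 * CPUS_PER_NODE * CORES_PER_CPU
k1_threads = k1_cores * THREADS_PER_CORE

print(f"k1: {k1_cores}C/{k1_threads}T")  # prints "k1: 144C/288T"
```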

Manuals:

mnl-2069.pdf mnl-2051.pdf

infra/hscloud.txt · Last modified: by radex
