---
output:
  slidy_presentation:
    duration: 45
title: Two decades of Open Source
author: Julian Ospald
date: Sep 20, 2024
---

## Follow the presentation {.centered}

![](QR.png){#id .class height=500px}

## Structure of this talk

1. Introduction (about me and my career)
2. Open Source (what it is and its value)
3. First chapter: Gentoo and package management
4. Second chapter: GHCup
5. Third chapter: Haskell Core Libraries
6. Lessons learned

# Introduction

## About me

* From Germany
* Studied CS
* Haskell developer
* I love open source

## Professional career

* Software Engineer in R&D (automotive industry)
* Go Backend Developer (online advertisement platform)
* Haskell Developer at **Capital Match** (invoice financing platform in Singapore)
* Haskell Developer at **IOHK** (Cardano Blockchain)
* Haskell Freelancer (blockchain and others)
* Haskell Developer at **Standard Chartered Bank**
* Haskell Freelancer (chimney sweeper app for German businesses)

## Open Source career

* Gentoo Linux developer (core team), 2012-2016
    - Ebuild development (packaging)
    - Code review
    - Development of a [git workflow](https://www.gentoo.org/glep/glep-0066.html)
* Author of GHCup (the Haskell installer), ca. 2019
* Maintainer of Haskell core libraries: filepath, unix, os-string, file-io
* Implementation of the [Abstract FilePath Proposal](https://hasufell.github.io/posts/2022-06-29-fixing-haskell-filepaths.html)
* Member of the Haskell Core Libraries Committee 2023-2026
* Haskell Influencer (Haskell Foundation, ...)

# Open Source

## What is Open Source

* ![](osi.png){#id .class height=32px} A group of licenses (see OSI)
    * *Not* free software
    * *Not* copyleft
* 🧑‍🤝‍🧑 A community
    * volunteers
    * companies
* 🔮 A philosophy
    * sharing
    * collaboration
    * transparency

## Popular Open Source projects

* ![](512px-Tux.svg.png){#id .class height=32px} Linux kernel
    * 1500 developers from 200-250 companies
* ![](firefox.png){#id .class height=32px} Firefox
* ![](vscode.png){#id .class height=32px} VSCode
* ![](blender.png){#id .class height=32px} Blender
* ![](haskell_logo.png){#id .class height=32px} GHC (The Haskell compiler)

## Value proposition of Open Source

* ⚗️ the scientific method
    * share your results
    * allow people to replicate them
* 🔓 access to a community
    * users
    * collaborators
* 🕸️ network effects
    * OSS community adopts fast
    * compare with git

## Reality of Open Source

* most projects...
    * are one-man shows
    * have no users
    * are underdocumented
    * have horrible code
* writing new code is easy, maintenance is hard
* most maintainers
    * don't get paid
    * will stop maintenance at some point
    * don't care much about their users

# First chapter: Gentoo and package management

## What is Gentoo

* a Linux distribution
* rolling release
* source based
* 19000 packages (program, library, ...)
* 200 core developers (at its peak)
* over 1000 contributors

## How does a Linux distro work (relationships)

![](packager_relationships.svg){#id .class height=500px}

## How does a Linux distro work (activities)

![](install.svg){#id .class height=500px}

## A typical ebuild

```bash
EAPI=8

DESCRIPTION="A dummy package"
HOMEPAGE="https://dummy.org"
SRC_URI="https://github.com/dummy/dummy/archive/refs/tags/${PV}.tar.gz -> ${P}.tar.gz"

LICENSE="BSD-3"
SLOT="0"
KEYWORDS="~amd64 ~x86"
IUSE="debug"

RDEPEND="dev-libs/boost"

PATCHES=(
    "${FILESDIR}"/${PN}-4.9.2-disable_python_rpath.patch
)

src_configure() {
    econf $(use_enable debug)
}

src_compile() {
    emake
}

src_test() {
    emake test
}

src_install() {
    emake DESTDIR="${D}" install
}
```

## Packaging challenges

* no standard for build systems (make, autotools, meson, cmake, ...)
    * => an abstraction over build systems
* thousands of different execution environments (fragility)
    - system configuration
    - package configuration
    - platform, architecture
* reverse dependencies
    - shipping a "chain" instead of a single artifact
* high impact of small mistakes (e.g. assuming a specific shell)

## Packaging challenges (pt. 2)

* communication between teams/maintainers
* execution of large changes
    - e.g. introduction of LibreSSL
    - e.g. changing of fundamental workflows (from CVS to git)
* monitoring upstream changes and making decisions about compatibility/stability
    - when to update

## What is a Distro really?

* a user experience
    - LTS distros vs rolling release
    - binary vs source based
    - choice of init system
* plug and play (everything works)
* deviating from the happy path (fixing issues)
* combining components into a coherent system (init system, coreutils, kernel, ...)
* a choice of **defaults**

## Programming lessons

* primary packaging skill: being meticulous
* small mistakes -> big impact
* being as precise as possible about what you want to achieve
* long term maintenance of small code pieces
* intense review culture
* strict policies and workflow guidelines
* how to learn complex systems

# Second chapter: GHCup

## Demo {.centered}

![](ghcup.png){#id .class height=500px}

## State of 2019 (Haskell Installers)

* stack is the only "Haskell Installer"
* no unified alternative for cabal users
* distro packages, nix, manual installs, ...
* 😭

## How it started

* 🤹 small team at work (Capital Match), using different platforms
    - originally used stack
    - distro packages constantly out of date
* 🦾 first version was 165 LOC
    - POSIX shell
* ![](linux.png){#id .class height=32px} only supported Linux and macOS
* ![](rust.png){#id .class height=32px} inspired by **rustup**
* support from haskell.org

## GHCup today

[Haskell Survey 2022](https://taylor.fausak.me/2022/11/18/haskell-survey-results/#s2q1):

![](survey.png)

- over **17k** LOC Haskell
- supports all major platforms: Linux, Windows, macOS, FreeBSD
- first thing new Haskell users get exposed to

## What is GHCup (simplified)?

```sh
curl -s -L \
  'https://downloads.haskell.org/~ghc/9.6.5/ghc-9.6.5-x86_64-fedora33-linux.tar.xz' |
  tar -xJ -C /tmp &&
  cd /tmp/ghc-9.6.5-x86_64-unknown-linux/ &&
  ./configure --prefix="$HOME/.local" &&
  make install &&
  rm -rf /tmp/ghc-9.6.5-x86_64-unknown-linux/
```

## What is GHCup really?

* ![](open-box.png){#id .class width=32 height=32px} installer (portable)
* ![](debian.png){#id .class width=32 height=32px} distribution channel
* ![](feedback.png){#id .class width=32 height=32px} feedback channel
* ![](qa.png){#id .class width=32 height=32px} testing/QA gateway
* ![](user.png){#id .class width=32 height=32px} provider of sane defaults (e.g. "recommended" GHC version)
* ![](chain-saw.png){#id .class width=32 height=32px} glue for holistic toolchain experience
    - VSCode, stack, cabal-install integration
* ![](ghaction.png){#id .class width=32 height=32px} CI provisioning (e.g. GitHub Actions)

## Relationships in detail

Dependencies:

- supported tools
    - GHC
    - Cabal
    - HLS
    - Stack
- decisions that affect us
    - release frequency
    - upstream CI
    - platform support
    - binary distributions (the `.tar.gz`/`.zip`)

## Relationships in detail (pt. 2)

Dependents:

- ![](haskell_logo.png){#id .class height=32px} Haskell developers
    - beginners, advanced, students, companies
- ![](person.png){#id .class width=32 height=32px} end users (e.g. compiling pandoc from source)
- ![](ghaction.png){#id .class width=32 height=32px} GitHub CI
    - GitHub images, Haskell repos
- 🪞 mirrors
    - [sjtug](https://mirror.sjtu.edu.cn/docs/ghcup)
- 🧰 tools
    - [vscode-haskell](https://github.com/haskell/vscode-haskell), [Haskell playground](https://play.haskell.org/), [nvim-lsp-installer](https://github.com/williamboman/nvim-lsp-installer)

## Programming lessons

* writing a small single-purpose program from scratch
* how to design command line interfaces
* high impact of decisions (not just mistakes)
    - bugs now affect GitHub CI and companies
    - can make "Haskell" look bad
* no one to review
    - => review your own code
* constantly thinking about ways to improve reliability
    - can't rely on anyone else to catch bugs

## The difference to Gentoo

Both deal with installation, but...

* more code for me to maintain (not just packaging)
* one-man project (mostly)
* much tighter coupling between upstream (e.g. GHC developers) and downstream (GHCup developers)
* heavier on relationship issues
* fewer dependencies, but much more responsibility
* position of authority
    - what to consider?
* most of my work today is support

# Third chapter: Haskell Core Libraries

## What are Haskell Core Libraries?

* bundled with the compiler
* fundamental building blocks (primitives)
* base library
    - available to all programs by default
    - contains the "Prelude" (standard library)

## Core libraries I maintain

* filepath
* unix
* os-string
* file-io

## Challenges

* changes are extremely expensive
* writing good primitives is hard (non-specific APIs)
* lots of odd knowledge
    - e.g. Windows filepaths
        - `C:foo`
        - `/bar`
        - `\\?\GLOBALROOT\Device\Harddisk0\Partition2\foo\bar`
    - POSIX standard
* portability

## Core Libraries Committee

* 7 members
* manages API changes of `base` only
* requires a proposal
* requires impact assessment for breaking changes
* requires an up-front implementation of the change
* ensures other core libraries have active maintainers
* does not interfere with maintenance

## Driving changes across core libraries (case study)

Abstract FilePath Proposal:

- Haskell String type: `type String = [Char]`
    - `Char` is a Unicode code point
    - not bytes
    - is interpreted (decoded)
    - depends on locale
- affects most core libraries
    - implement as a breaking change (base), or...
    - in "user-space"
- lack of higher authority
    - building consensus
    - convincing multiple maintainers
    - patching many libraries
    - open source politics

## Programming lessons

* how to design good primitives
    - as opposed to abstractions
* considering every impact of API changes
* doing history research on past design choices
    - important design decisions may not be documented
    - may look innocent
    - changing them might be devastating

# Lessons Learned

## Collaboration

::: incremental

* main currency in Open Source is energy
* treat contributors like kings
* be mindful about boundaries (tricky balance)
* respect other projects' workflows
* driving large changes requires
    - consensus
    - support
    - a good value proposition
    - a good execution plan (risk, breakage, ...)
* Haskell Foundation

:::

## Project maintenance

::: incremental

* dictatorships work, but are not sustainable
* plan for your departure
* bus factor is your constant enemy
* good decision making processes
    - lightweight when risk is low
    - elaborate when risk is high
* actively think about the contribution experience
    - comment early
* how to maintain the project vision?

:::

## User Experience

::: incremental

* UX is harder than CS
* yet often an afterthought
* toolchains often lack a holistic UX vision
* UX vision gets easily lost in "maintenance mode"
    * feature creep
    * maintainer turnover
    * collective decisions
* UX is a fascinating problem (e.g. OS)
    - plug & play (intuition... about interfaces)
    - happy path (control)
    - defaults (expectations... about behavior)

:::

## Stability vs Progress

* 🤼 conflicting goals
* ⚔️ breaking APIs can have large ripple effects
    - experience report of a Facebook engineer on GHC upgrades
* 🗼 small breakages add up
    - large projects have hundreds of dependencies
* 🗣️ many discussions in the Haskell community
    - upgrade cost
    - language changes (Haskell report)
    - academic background of Haskell (academia vs industry?)
    - the role of committees
* ⚖️ how to strike a balance?
    - SemVer does not solve it (why?)

## Composition

* 💜 I love small programs
* categories of composition
    - λ: functions
    - 📚 libraries (the issue with types)
    - ⚙️ programs
* Unix philosophy
    - 🛠️ do one thing and do it well
    - ⚗️ pipes, compose stdout and stdin (re-usable)
* how to make your project composable?
    - [https://clig.dev/](https://clig.dev/)

## Questions/Arguments? {.centered}

![](jiddu.jpg){#id .class height=500px}
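
## Appendix: composing small programs

A minimal sketch of the "pipes, compose stdout and stdin" point from the Composition slide, assuming a POSIX shell with the standard `tr`, `sort`, `uniq`, `head` and `awk` utilities. The function name `top_words` is hypothetical and only serves as an illustration; it is not part of the talk or of any tool mentioned in it.

```sh
#!/bin/sh
# top_words N (hypothetical helper): read text on stdin and print the
# N most frequent words as "count word", most frequent first.
# Every stage is a small single-purpose program; the pipe composes them.
top_words() {
    tr -cs '[:alpha:]' '\n' |        # split into one word per line
        tr '[:upper:]' '[:lower:]' | # normalise case
        sort |                       # bring identical words together
        uniq -c |                    # count each run of identical words
        sort -rn |                   # most frequent first
        head -n "${1:-3}" |          # keep the top N (default 3)
        awk '{ print $1, $2 }'       # strip the padding uniq adds
}

printf 'a b A c a b\n' | top_words 2
```

Each stage only reads stdin and writes stdout, so any of them can be swapped out or reused in another pipeline without touching the rest.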