title | author | date
---|---|---
Two decades of Open Source | Julian Ospald | Sep 20, 2024
Follow the presentation
Structure of this talk
- Introduction (about me and my career)
- Open Source (what it is and its value)
- First chapter: Gentoo and package management
- Second chapter: GHCup
- Third chapter: Haskell Core Libraries
- Lessons learned
Introduction
About me
- From Germany
- Studied CS
- Haskell developer
- I love open source
Professional career
- Software Engineer in R&D (automotive industry)
- Go Backend Developer (online advertisement platform)
- Haskell Developer at Capital Match (invoice financing platform in Singapore)
- Haskell Developer at IOHK (Cardano Blockchain)
- Haskell Freelancer (blockchain and others)
- Haskell Developer at Standard Chartered Bank
- Haskell Freelancer (chimney sweeper app for German businesses)
Open Source career
- Gentoo Linux developer (core team), 2012-2016
- Ebuild development (packaging)
- Code review
- Development of a git workflow
- Author of GHCup (the Haskell installer), ca. 2019
- Maintainer of Haskell core libraries: filepath, unix, os-string, file-io
- Implementation of the Abstract FilePath Proposal
- Member of the Haskell Core Libraries Committee 2023-2026
- Haskell Influencer (Haskell Foundation, ...)
Open Source
What is Open Source
- A group of licenses (see OSI)
- Not free software
- Not copyleft
- 🧑‍🤝‍🧑 A community
- volunteers
- companies
- 🔮 A philosophy
- sharing
- collaboration
- transparency
Popular Open Source projects
- Linux kernel
- 1500 developers from 200-250 companies
- Firefox
- VSCode
- Blender
- GHC (The Haskell compiler)
Value proposition of Open Source
- ⚗️ the scientific method
- share your results
- allow people to replicate it
- 🔓 access to a community
- users
- collaborators
- 🕸️ network effects
Reality of Open Source
- most projects...
- are one-man shows
- have no users
- are underdocumented
- have horrible code
- writing new code is easy, maintenance is hard
- most maintainers
- don't get paid
- will stop maintenance at some point
- don't care much about their users
First chapter: Gentoo and package management
What is Gentoo
- a Linux distribution
- rolling release
- source based
- 19000 packages (program, library, ...)
- 200 core developers (at its peak)
- over 1000 contributors
How does a Linux distro work (relationships)
How does a Linux distro work (activities)
A typical ebuild
EAPI=8
DESCRIPTION="A dummy package"
HOMEPAGE="https://dummy.org"
SRC_URI="https://github.com/dummy/dummy/archive/refs/tags/${PV}.tar.gz -> ${P}.tar.gz"
LICENSE="BSD-3"
SLOT="0"
KEYWORDS="~amd64 ~x86"
IUSE="debug"
RDEPEND="dev-libs/boost"
PATCHES=( "${FILESDIR}"/${PN}-4.9.2-disable_python_rpath.patch )
src_configure() {
	econf $(use_enable debug)
}
src_compile() {
	emake
}
src_test() {
	emake test
}
src_install() {
	emake DESTDIR="${D}" install
}
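How a package manager drives the phase functions of an ebuild can be sketched in a few lines of POSIX shell. This is only an illustration, not Portage's actual implementation: `run_phases` and the `default_*` functions are hypothetical names, and a real package manager adds sandboxing, logging and many more phases.

```shell
#!/bin/sh
# Hypothetical defaults, used when a package does not define a phase itself.
default_src_configure() { echo "configure: default"; }
default_src_compile()   { echo "compile: default"; }
default_src_install()   { echo "install: default"; }

# Run each phase in order; prefer the package's own function if it exists.
run_phases() {
    for phase in src_configure src_compile src_install; do
        if command -v "$phase" >/dev/null 2>&1; then
            "$phase" || return 1
        else
            "default_$phase" || return 1
        fi
    done
}

# A "package" that overrides only one phase, like a small ebuild would.
src_compile() { echo "compile: custom"; }

run_phases
```

The package only states what differs from the defaults; the driver owns the order of the phases and the error handling.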
Packaging challenges
- no standard on build systems (make, autotools, meson, cmake, ...)
- => an abstraction over build systems
- thousands of different execution environments (fragility)
- system configuration
- package configuration
- platform, architecture
- reverse dependencies
- shipping a "chain" instead of a single artifact
- high impact of small mistakes (e.g. assuming a specific shell)
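The shell assumption is a classic case: `[[ ... ]]` is a bash extension, so a script that uses it can break where `/bin/sh` is a stricter POSIX shell such as dash. A minimal sketch of a portable alternative follows; `has_prefix` is a hypothetical helper name chosen for this example.

```shell
#!/bin/sh
# Bash-only version (not valid in plain POSIX sh): [[ $1 == "$2"* ]]
# Portable version: POSIX `case` pattern matching.
has_prefix() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

if has_prefix "ghc-9.6.5" "ghc-"; then
    echo "looks like a GHC version"
fi
```

Both variants behave the same under bash, which is why this kind of mistake tends to surface only on a user's machine.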
Packaging challenges (pt. 2)
- communication between teams/maintainers
- execution of large changes
- e.g. introduction of LibreSSL
- e.g. changing of fundamental workflows (from CVS to git)
- monitoring upstream changes and making decisions about compatibility/stability
- when to update
What is a Distro really?
- a user experience
- LTS distros vs rolling release
- binary vs source based
- choice of init system
- plug and play (everything works)
- deviating from the happy path (fixing issues)
- combining components into a coherent system (init system, coreutils, kernel, ...)
- a choice of defaults
Programming lessons
- primary packaging skill: being meticulous
- small mistakes -> big impact
- being as precise as possible about what you want to achieve
- long term maintenance of small code pieces
- intense review culture
- strict policies and workflow guidelines
- how to learn complex systems
Second chapter: GHCup
Demo
State of 2019 (Haskell Installers)
- stack is the only "Haskell Installer"
- no unified alternative for cabal users
- distro packages, nix, manual installs, ...
- 😭
How it started
- 🤹 small team at work (Capital Match), using different platforms
- originally used stack
- distro packages constantly out of date
- 🦾 first version was 165 LOC
- POSIX shell
- only supported Linux and macOS
- inspired by rustup
- support from haskell.org
GHCup today
- over 17k LOC Haskell
- supports all platforms: Linux, Windows, macOS, FreeBSD
- first thing new Haskell users get exposed to
What is GHCup (simplified)?
curl -s -L \
'https://downloads.haskell.org/~ghc/9.6.5/ghc-9.6.5-x86_64-fedora33-linux.tar.xz' |
tar -xJ -C /tmp &&
cd /tmp/ghc-9.6.5-x86_64-unknown-linux/ &&
./configure --prefix="$HOME/.local" &&
make install &&
rm -rf /tmp/ghc-9.6.5-x86_64-unknown-linux/
What is GHCup really?
- installer (portable)
- distribution channel
- feedback channel
- testing/QA gateway
- provider of sane defaults (e.g. "recommended" GHC version)
- glue for holistic toolchain experience
- VSCode, stack, cabal-install integration
- CI provisioning (e.g. GitHub Actions)
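One thing a portable installer has to do before it can download anything is work out which platform it runs on. The following is a rough sketch of that step for Unix-like systems only. `platform_label` and the labels it prints are assumptions made for this example and do not reflect GHCup's actual naming scheme or logic.

```shell
#!/bin/sh
# Map `uname` output to a platform label.
# The labels are illustrative assumptions, not GHCup's real identifiers.
platform_label() {
    os=$1 arch=$2
    case "$os" in
        Linux)   os_part=linux ;;
        Darwin)  os_part=darwin ;;
        FreeBSD) os_part=freebsd ;;
        *)       echo "unsupported OS: $os" >&2; return 1 ;;
    esac
    case "$arch" in
        x86_64|amd64)  arch_part=x86_64 ;;
        aarch64|arm64) arch_part=aarch64 ;;
        *)             echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
    echo "${arch_part}-${os_part}"
}

platform_label "$(uname -s)" "$(uname -m)" || echo "please install manually"
```

Taking the OS and architecture as arguments, instead of calling `uname` inside the function, keeps the mapping testable on any machine.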
Relationships in detail
Dependencies:
- supported tools
- GHC
- Cabal
- HLS
- Stack
- decisions that affect us
- release frequency
- upstream CI
- platform support
- binary distributions (the `.tar.gz`/`.zip`)
Relationships in detail (pt. 2)
Dependents:
- Haskell developers
- beginners, advanced, students, companies
- end users (e.g. compiling pandoc from source)
- GitHub CI
- GitHub images, Haskell repos
- 🪞 mirrors
- 🧰 tools
Programming lessons
- writing a small single-purpose program from scratch
- how to design command line interfaces
- high impact of decisions (not just mistakes)
- bugs now affect GitHub CI and companies
- can make "Haskell" look bad
- no one to review
- => review your own code
- constantly thinking about ways to improve reliability
- can't rely on anyone else to catch bugs
The difference from Gentoo
Both deal with installation, but...
- more code to maintain (not just packaging) for me
- one-man project (mostly)
- much tighter coupling between upstream (e.g. GHC developers) and downstream (GHCup developers)
- heavier on relationship issues
- fewer dependencies, but much more responsibility
- position of authority
- what to consider?
- most of my work today is support
Third chapter: Haskell Core Libraries
What are Haskell Core Libraries?
- bundled with the compiler
- fundamental building blocks (primitives)
- base library
- available to all programs by default
- contains the "Prelude" (standard library)
Core libraries I maintain
- filepath
- unix
- os-string
- file-io
Challenges
- changes are extremely expensive
- writing good primitives is hard (non-specific APIs)
- lots of odd knowledge
- e.g. Windows filepaths: `C:foo`, `/bar`, `\\?\GLOBALROOT\Device\Harddisk0\Partition2\foo\bar`
- POSIX standard
- portability
Core Libraries Committee
- 7 members
- manages API changes of `base` only
- requires a proposal
- requires impact assessment for breaking changes
- requires an up-front implementation of the change
- ensures other core libraries have active maintainers
- does not interfere with maintenance
Driving changes across core libraries (case study)
Abstract FilePath Proposal:
- Haskell String type: `type String = [Char]`
- `Char` is a Unicode code point
- not bytes
- is interpreted (decoded)
- depends on locale
- affects most core libraries
- implement as a breaking change (base), or...
- in "user-space"
- lack of higher authority
- building consensus
- convincing multiple maintainers
- patching many libraries
- open source politics
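The gap between raw bytes and decoded characters is not specific to Haskell and can be observed from a shell as well. In this minimal sketch, `byte_count` and `char_count` are hypothetical helper names, and the locale name `C.UTF-8` is an assumption: it is not installed on every system.

```shell
#!/bin/sh
# "ä" encoded as UTF-8 is the two bytes 0xC3 0xA4 (octal 303 244).
a_umlaut=$(printf '\303\244')

# Count raw bytes; independent of any locale.
byte_count() { printf '%s' "$1" | wc -c | tr -d ' '; }

# Count characters; the result depends on how the given locale decodes the bytes.
char_count() { printf '%s' "$2" | LC_ALL="$1" wc -m | tr -d ' '; }

byte_count "$a_umlaut"          # prints 2
char_count C.UTF-8 "$a_umlaut"  # locale-dependent
char_count C "$a_umlaut"        # locale-dependent
```

The byte count never changes, while the character count is an interpretation that can differ between locales. A file path stored as a decoded string inherits that ambiguity.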
Programming lessons
- how to design good primitives
- as opposed to abstractions
- considering every impact of API changes
- doing history research on past design choices
- important design decisions may not be documented
- may look innocent
- changing them might be devastating
Lessons Learned
Collaboration
::: incremental
- main currency in Open Source is energy
- treat contributors like kings
- be mindful about boundaries (tricky balance)
- respect other projects' workflows
- driving large changes requires
- consensus
- support
- a good value proposition
- a good execution plan (risk, breakage, ...)
- Haskell Foundation
:::
Project maintenance
::: incremental
- dictatorships work, but are not sustainable
- plan for your departure
- bus factor is your constant enemy
- good decision making processes
- lightweight when risk is low
- elaborate when risk is high
- actively think about the contribution experience
- comment early
- how to maintain the project vision?
:::
User Experience
::: incremental
- UX is harder than CS
- yet often an afterthought
- toolchains often lack a holistic UX vision
- UX vision gets easily lost in "maintenance mode"
- feature creep
- maintainer turnover
- collective decisions
- UX is a fascinating problem (e.g. OS)
- plug & play (intuition... about interfaces)
- happy path (control)
- defaults (expectations... about behavior)
:::
Stability vs Progress
- 🤼 conflicting goals
- ⚔️ breaking API can have large rippling effects
- experience report of a Facebook engineer on GHC upgrades
- 🗼 small breakages add up
- large projects have hundreds of dependencies
- 🗣️ many discussions in the Haskell community
- upgrade cost
- language changes (Haskell report)
- academic background of Haskell (academia vs industry?)
- the role of committees
- ⚖️ how to strike a balance?
- SemVer does not solve it (why?)
Composition
- I love small programs
- categories of composition
- functions
- libraries
- programs
- unix philosophy
- 🛠️ do one thing and do it well
- ⚗️ pipes, compose stdout and stdin (re-usable)
- how to make your project composable?
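Composition through pipes can be sketched with standard tools only. `word_frequencies` is a hypothetical name for this example; every stage does one thing and only reads stdin and writes stdout.

```shell
#!/bin/sh
# Each stage does one thing; pipes connect its stdout to the next stdin.
word_frequencies() {
    tr -cs '[:alpha:]' '\n' |         # split input into one word per line
        tr '[:upper:]' '[:lower:]' |  # normalise case
        sort |                        # group identical words
        uniq -c |                     # count each group
        sort -rn |                    # most frequent first
        head -n "${1:-3}"             # keep the top N (default 3)
}

printf 'Do one thing and do it well. Do it.\n' | word_frequencies 2
```

Because the function itself only uses stdin and stdout, it can be dropped into a longer pipeline without any changes.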