Microsoft is committed to helping improve open source community interactions by sharing tools, services, and guidance.
This is an evolving set of resources inspired by our experiences building Microsoft's open source programs office, folks like you, and our peers in the TODO Group.
Tools and services
GitHub.com is at the forefront of today's open source movement, but the public product introduces many challenges for an enterprise to function at scale.
When interfacing with third-party services such as GitHub, it is important to be able to identify employees at the same company working together on open source.
While GitHub allows organization members to publicize their organization membership on their individual profile, there is more to know. GitHub user management solutions will offer the following capabilities:
- Employment lifecycle automation: Employees should be able to join a company organization, if authorized, without needing a manual invitation. When an employee decides to leave the company, their access to company resources and GitHub organization membership should be removed.
- User linking: Authenticating an employee with both corporate IT and GitHub to associate the GitHub account with the employee's corporate identity.
- Communications: By linking an employee's corporate identity with their GitHub account, comms are possible without having to ask the user to share their email address or other information on their public profile.
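The lifecycle and linking capabilities above can be sketched as a reconciliation step. This is a minimal illustration only, not a description of Microsoft's tooling: the `Link` record, the `plan_membership_changes` function, and the sample identities are all hypothetical, and a real system would read from the corporate directory and the GitHub API rather than in-memory sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical record tying a corporate identity to a GitHub login."""
    corporate_id: str
    github_login: str


def plan_membership_changes(links, active_employees, org_members):
    """Return (eligible_to_join, to_remove) as sets of GitHub logins.

    links: iterable of Link records
    active_employees: set of corporate IDs that are currently employed
    org_members: set of GitHub logins currently in the organization
    """
    # Logins whose linked corporate identity is still an active employee.
    authorized = {l.github_login for l in links if l.corporate_id in active_employees}
    # Authorized people who are not members yet may join without a manual invite.
    eligible_to_join = authorized - org_members
    # Members without an active link: departed employees or unlinked accounts.
    to_remove = org_members - authorized
    return eligible_to_join, to_remove


links = [Link("alice@example.com", "alice-gh"), Link("bob@example.com", "bob-gh")]
eligible_to_join, to_remove = plan_membership_changes(
    links,
    active_employees={"alice@example.com"},  # bob has left the company
    org_members={"bob-gh", "carol-gh"},
)
print(sorted(eligible_to_join), sorted(to_remove))
```

Keeping the planning step free of side effects means the proposed changes can be reviewed or logged before any membership is actually modified.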
Self-service onboarding & management
At corporate scale, a GitHub organization with hundreds or thousands of members requires tooling that makes joining the organization automated instead of manual. When employees leave a company, tooling should automatically revoke their permissions.
At Microsoft we also use our onboarding portal as an opportunity to customize the "new repository" create experience. By moving this capability from GitHub to a Microsoft-internal portal, we're able to:
- Ask questions about release approval
- Gather license information and pre-populate new repos with the official LICENSE
- Help a user identify the teams they use most often when setting permissions
- Understand company policy and guidance
GitHub management tooling builds on the GitHub API and typically interfaces with corporate systems such as the directory. Microsoft uses a portal approach while Netflix has open sourced their Slack chat bot to do the same.
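As a rough sketch of how a portal might sit in front of repository creation, the example below builds the request for GitHub's "create an organization repository" REST call (`POST /orgs/{org}/repos`) only after a release approval flag is set. The `build_repo_request` helper, the `LICENSE_TEMPLATES` mapping, the `contoso` organization, and the team ID are assumptions made for illustration; the example prints the request and does not send it.

```python
import json

# Hypothetical mapping from a portal's license choices to GitHub license
# template keywords; the real portal's options are not described here.
LICENSE_TEMPLATES = {"MIT": "mit", "Apache-2.0": "apache-2.0"}


def build_repo_request(org, name, license_name, team_id=None, approved=False):
    """Build the path and JSON body for creating an organization repository,
    refusing requests that have no release approval."""
    if not approved:
        raise ValueError("release approval is required before creating a repo")
    body = {
        "name": name,
        "private": False,
        # Asks GitHub to pre-populate the repo with this license's LICENSE file.
        "license_template": LICENSE_TEMPLATES[license_name],
    }
    if team_id is not None:
        body["team_id"] = team_id  # team that will be granted access
    return f"/orgs/{org}/repos", body


path, body = build_repo_request(
    "contoso", "sample-tool", "MIT", team_id=42, approved=True
)
print("POST", path)
print(json.dumps(body, indent=2))
```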
Public project presence / exploration
While GitHub offers broad search capabilities and newer features like "topics", it can be difficult to locate a single large company's open source on GitHub. By providing a repository exploration experience, we can work to show all of the company's various organizations, open source projects that have been released, and other assets. This enables people to look across the company instead of focusing on specific efforts.
Insights & data
Connecting data sources such as the GitHub event stream to big data systems can offer important insights into the health and engagement of your company's open source efforts.
Connected to the near-realtime webhook event stream from GitHub, and able to crawl the GitHub API and associated objects, ghcrawler is able to bridge data cubes and big-data storage systems such as Azure Data Lake with a complete view of a corporate GitHub presence.
Using Power BI and other data solutions, you can gain insight into how communities are using GitHub and how users interact with projects through pull requests, issues, and more, and you can look for trends in the amount of time to resolve issues, interest in projects, etc.
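One of the trends mentioned above, time to resolve issues, can be computed from archived issue records. The sketch below assumes each record carries `created_at` and `closed_at` fields in the ISO 8601 UTC form the GitHub API uses; the `median_hours_to_close` function and the sample data are hypothetical.

```python
from datetime import datetime
from statistics import median

# ISO 8601 UTC timestamps, e.g. "2017-05-01T09:00:00Z"
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def median_hours_to_close(issues):
    """Median hours between created_at and closed_at, ignoring open issues."""
    durations = []
    for issue in issues:
        if issue.get("closed_at") is None:
            continue  # issue is still open
        opened = datetime.strptime(issue["created_at"], TIMESTAMP_FORMAT)
        closed = datetime.strptime(issue["closed_at"], TIMESTAMP_FORMAT)
        durations.append((closed - opened).total_seconds() / 3600)
    return median(durations) if durations else None


sample_issues = [
    {"created_at": "2017-05-01T09:00:00Z", "closed_at": "2017-05-01T15:00:00Z"},
    {"created_at": "2017-05-02T09:00:00Z", "closed_at": "2017-05-03T09:00:00Z"},
    {"created_at": "2017-05-03T09:00:00Z", "closed_at": None},
    {"created_at": "2017-05-04T10:00:00Z", "closed_at": "2017-05-04T22:00:00Z"},
]
print(median_hours_to_close(sample_issues))  # 12.0
```

The median is used here rather than the mean so that a few very old issues do not dominate the figure; either could be reasonable depending on the question being asked.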
By archiving important information about the company's GitHub presence, more long-term analysis options exist.
Some of the resources Microsoft archives include:
- Repo traffic data
- Repository metadata over time (e.g. watchers or fork stats)
- GitHub user management information (who employees are) over time
GitHub services & extensions
Many great services exist for helping teams to collaborate on GitHub. An open source programs office will likely need to operate services to extend GitHub with organization-specific services such as contributor license agreements.
Contributor License Agreement (CLA)
It is vital to require a Contributor License Agreement (CLA) for open source contributions to projects.
A CLA service integrates with GitHub pull requests to provide information about the contribution status of a pull request, while also offering an opportunity for new community members to get connected to sign the necessary legal document.
At this time the Microsoft CLA tooling is proprietary and due to be replaced with a solid open source community project.
Reacting to GitHub webhooks or running as periodic tasks, repo linters are able to look at a project and assess its health, potentially opening issues or pull requests. Ideally corporate projects will conform to a certain set of requirements such as the right LICENSE file, presence of policies or governance info, etc.
In other scenarios cops and extensions might look for accidentally checked in credentials, third-party source code, or other potential red flags for a project.
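The two kinds of checks described above, required files and accidentally committed credentials, can be sketched as a single pass over a repository's files. Everything here is an assumption for illustration: the `REQUIRED_FILES` policy, the `SECRET_PATTERN` regular expression (far too simple for production use), and the `lint_repo` function, which takes file contents as a dictionary instead of reading from GitHub.

```python
import re

# Hypothetical policy: files every corporate project is expected to contain.
REQUIRED_FILES = ["LICENSE", "README.md", "CONTRIBUTING.md"]

# Illustrative pattern only: flags lines that look like hard-coded secrets.
SECRET_PATTERN = re.compile(
    r"(?i)(password|secret|api[_-]?key)\s*[:=]\s*['\"][^'\"]+['\"]"
)


def lint_repo(files):
    """files: dict mapping path -> text content. Returns a list of findings."""
    findings = []
    for required in REQUIRED_FILES:
        if required not in files:
            findings.append(f"missing required file: {required}")
    for path, text in sorted(files.items()):
        for number, line in enumerate(text.splitlines(), start=1):
            if SECRET_PATTERN.search(line):
                findings.append(f"possible credential in {path}:{number}")
    return findings


repo = {
    "README.md": "# sample\n",
    "LICENSE": "MIT License\n",
    "config.py": "debug = True\napi_key = 'abc123'\n",
}
for finding in lint_repo(repo):
    print(finding)
```

Each finding could then be turned into an issue or pull request on the project, as the text describes.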
Open source obligations, package publishing and more
There are a large number of services and needs for a large corporate open source programs office to help engineers to be efficient as they explore the value of open source.
Source code disclosure
Many open source licenses have requirements to publish modified source code. A source disclosure site enables employees to post new code to publish, and also for the public to explore source that is being disclosed.
Package management & publishing
Enabling engineers to efficiently ship packages on any package manager, be it NuGet for .NET, npm for Node, Maven for Java, or just GitHub repos for Go, can take a lot of work, manual or automated, when package management has to scale across a large corporation.
Maintaining mirrors of public open source repos and components for disaster recovery or other purposes is an important service to offer projects taking a dependency on source control systems and services external to the company and the IT department.
Operations and process
There will always be some level of manual operations required, especially when working with new products, features, and other investments ahead of an engineering investment to enable broad use at scale. Having an operations plan for addressing internal needs, external communications, and other situations is a part of any open source office.
Having a comms plan to communicate with open source members, collaborators, project managers and even org owners can take a great deal of planning and organizing.
Component security & scanning
Having the ability to work with security data sources and disclosures, industry consultants and experts, and other resources helps identify potential security concerns in components.
Intellectual property scanning
For high-value products and investments, with the advice of business leaders and their legal advisors, IP scanning is a part of enabling broad open source in a company.
Registration and release approvals
At Microsoft we think about registration (our products and services registering when they use an open source component or make a substantial contribution to an open source project) as well as information about when source code will be released as open source.
Registration (use & contribution)
Users of open source in products and services need a mechanism to report their open source usage. This may manifest through manual registration for small projects, or automated build-based reports of the "open source cart" involved in a project.
From a contribution standpoint, after a team is using a component, allowing them to easily submit the appropriate request or information to contribute a set of changes back to the public open source project is key.
Having a central point to get approval to release open source is another open source approval workflow. Optimizations designed into the system, such as allowing a large product team to get approval for a broad initiative and then skipping approvals for work within that project, might be worthwhile investments.
Every company will have its own unique policies and procedures. This section contains information about some sorts of policies to think about when running an open source programs office.
What policies to think about
The resources and policy questions can help guide company executives, legal departments, open source subject-matter experts and stakeholders to craft the best policy to help the company achieve its goals.
This resource is still under development
This information is being actively developed.
Open source community resources
Linux Foundation curriculum
Professional open source management fundamentals, inclusive speaker training, compliance basics for developers, and other courses curated by the Linux Foundation.
Open Source Guides
Collated on GitHub and initially built by GitHub, the Open Source Guides cover scenarios that any business or engineering group needs.
TODO is an open group of companies collaborating on practices, tools, and other ways to run successful and effective open source projects and programs.