Why SBOM Generators Need to Accurately Represent Open Source Licenses

June 20, 2023 | November 13th, 2023 | Blog, Guest Blog
By Surendra Pathak, Interlynk

Post originally appeared on Medium, and has been revised for the OpenSSF.

The pace of software development has been accelerating for decades. A crucial part of this acceleration can be attributed to the continuous invention of newer programmable constructs: higher-level programming languages, compilers, packages, and frameworks, as well as composable units that are free and open source. These components provide the building blocks for a wide variety of typical application interactions, including data storage, authentication, APIs, encryption, logging, and even machine learning models. Each open-source component may become a building block for the next generation of applications, thus accelerating innovation.

Open Source licenses play the crucial role of enabling and managing this collaboration, innovation, and transparency. Licensing also impacts security; a project with uncertain licensing will typically not receive the security review and maintenance it otherwise would. As software supply chains become more complex by the day, it is increasingly essential to build visibility into components and their licenses in a way that supports automation rather than only manual processes.

This is where the Software Bill of Materials (SBOM) comes in. An SBOM is an “ingredients list” that identifies what components are in a software product. Many users want SBOMs to help them improve their security, e.g., by helping recipients identify (1) reused components with known vulnerabilities and (2) the products in their environment that use a component with an unusually dangerous vulnerability. However, SBOMs can also help users meet their licensing obligations: alongside identifying vulnerabilities, they let organizations track open-source usage and ensure compliance with numerous licensing obligations. Having a “single source of truth” for security and licensing information helps everyone. Let’s take a look at why SBOM generators need to accurately represent Open Source licenses.

Open Source Licenses

Open Source licenses define the terms and conditions under which software can be used, modified, and distributed. When considering Open Source components for software, understanding and inventorying licenses associated with those components must be part of the selection criteria. Different licenses have varying requirements and restrictions. Permissive licenses like MIT and BSD allow for more flexibility, while copyleft licenses like GPL impose stricter obligations, such as sharing modifications under the same license.

SBOM and Open Source Licenses

Including Open Source licenses in SBOMs helps organizations accurately track the licenses associated with their software components. This creates a more open and scalable path for managing license compliance risks and meeting necessary obligations.

  1. Ensuring compliance with Open Source licenses: By including licenses in an SBOM, organizations can easily keep an inventory of the licenses of the components used in their software projects. This helps ensure compliance with the licenses’ terms and conditions, avoiding legal ramifications or delays in product sales and M&A.
  2. Mitigating legal risks: Failure to comply with Open Source licenses can result in legal disputes and potential loss of intellectual property rights. Including licenses in an SBOM allows organizations to proactively address licensing issues and mitigate the legal risks associated with using Open Source components.
  3. Promoting transparency and trust: Open Source licenses in an SBOM foster transparency and trust within software supply chains. By openly declaring the licenses, organizations demonstrate their commitment to complying with their licensing obligations and build trust with users, customers, and the broader Open Source community.

License Placement in SBOMs

CycloneDX and SPDX, two of the three most commonly used and NTIA-recommended SBOM specifications, both provide precise and easy-to-implement ways to record licensing details in an SBOM.

SPDX: The SPDX License List is a list of the most commonly used licenses in open-source software and components, extensions of these licenses, and exceptions to them. License tracking was, according to some, the primary rationale for SPDX. As of version 3.20, the list includes 506 licenses and derivatives. The SPDX License List is a crucial part of the SPDX SBOM specification. Entries from the list can be combined into larger license expressions by following specific rules:

[Figure: SPDX license expressions]

The license-expression forms the backbone of license declarations, such as declaring licenses for Packages by the authors of the package, e.g.:

PackageLicenseDeclared: (LGPL-2.0-only AND LicenseRef-3)

Governing licenses as concluded by the authors can be expressed, e.g.:

PackageLicenseConcluded: (LGPL-2.0-only OR LicenseRef-3)

Comments or analysis can be provided as background information, e.g.:

PackageLicenseComments: <text>The license for this project changed with the release of version 1.4. 
The version of the project included here post-dates the license change.</text>
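To make the three fields above concrete, here is a minimal sketch of how a consumer might pull them out of SPDX tag-value text. The function name `parse_license_fields` and the choice to handle only these three tags are assumptions for illustration; a real parser would cover the full specification.

```python
import re

# Tags taken from the examples above; every other SPDX tag is ignored in this sketch.
LICENSE_TAGS = {
    "PackageLicenseDeclared",
    "PackageLicenseConcluded",
    "PackageLicenseComments",
}

# A value is either a <text>...</text> block (which may span several lines)
# or simply the rest of the line.
FIELD = re.compile(
    r"^(\w+):[ \t]*(?:<text>(.*?)</text>|([^\n]*))",
    re.MULTILINE | re.DOTALL,
)


def parse_license_fields(tag_value_text):
    """Return a dict of the license-related tags found in SPDX tag-value text."""
    fields = {}
    for match in FIELD.finditer(tag_value_text):
        tag = match.group(1)
        if tag not in LICENSE_TAGS:
            continue
        value = match.group(2) if match.group(2) is not None else match.group(3)
        # Collapse line breaks and repeated spaces inside multi-line comments.
        fields[tag] = " ".join(value.split())
    return fields


sample = """PackageLicenseDeclared: (LGPL-2.0-only AND LicenseRef-3)
PackageLicenseConcluded: (LGPL-2.0-only OR LicenseRef-3)
PackageLicenseComments: <text>The license for this project changed
with the release of version 1.4.</text>
"""

for tag, value in parse_license_fields(sample).items():
    print(f"{tag}: {value}")
```

Keeping the declared and concluded values as separate entries matters because the two can legitimately differ, as they do in the examples above (AND versus OR).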

CycloneDX:

The CycloneDX SBOM specification also uses the SPDX License List and SPDX license expressions, and additionally provides fields for links to the license text.

[Figure: CycloneDX license metadata]
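As a rough illustration of the CycloneDX side, the sketch below builds the `licenses` array of a component in the CycloneDX JSON format, where each entry is either a `license` object (with an SPDX `id` and an optional `url`) or a single `expression`. The helper name `cyclonedx_license_entry` and the component `example-lib` are hypothetical.

```python
import json


def cyclonedx_license_entry(spdx_id=None, expression=None, url=None):
    """Build one entry for a CycloneDX component's "licenses" array."""
    if expression is not None:
        # Compound licensing is carried as one SPDX license expression.
        return {"expression": expression}
    entry = {"id": spdx_id}
    if url:
        # Optional link to the license text.
        entry["url"] = url
    return {"license": entry}


# Hypothetical component, used only to show where the entry sits.
component = {
    "type": "library",
    "name": "example-lib",
    "version": "1.0.0",
    "licenses": [
        cyclonedx_license_entry(
            spdx_id="Apache-2.0",
            url="https://www.apache.org/licenses/LICENSE-2.0",
        )
    ],
}

print(json.dumps(component, indent=2))
```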

Known Challenges with Understanding Licenses from SBOM:

While specifications have described license declaration methodically, implementation in SBOM generators has lacked the same rigor. This has led to some challenges:

  1. Absent Licenses: The absence of license fields from the SBOM is the most common challenge. While it is accurate that CycloneDX and the latest version of SPDX have kept license fields optional (for broader use cases), the NTIA's Minimum Elements for an SBOM recommends including licenses to make SBOMs useful.
  2. Invalid Defaults: SPDX provides two defaults: NONE for when no license applies and NOASSERTION for when license discovery was not attempted, or the license is intentionally not included in the document. However, a number of invalid defaults (EMPTY, NULL, NOLICENSE, UNKNOWN) can be found in SBOMs generated by various tools. This creates problems for SBOM consumption tools.
  3. Invalid License References: An SPDX simple license expression can include a user-defined license reference. However, to make this useful for the SBOM consumption tool, this indirection requires: (a) a valid license-reference id and (b) an accurate description of the license reference itself. At Interlynk, we see various errors in such license expressions, including (a) invalid characters in license references, (b) dangling license references, (c) non-terminating license references, and (d) license references to external documents without the external documents themselves being provided.
  4. Invalid License Expressions: Expressing licenses according to the ABNF grammar referenced above is the only sure way of ensuring automated processes can parse an SBOM accurately and extract the licensing information. However, we have seen very little adherence to the ABNF. Generators instead rely on various shorthands that make licenses easy for humans to read (and therefore carry a presumption of accuracy) while causing errors for consumption tools. Examples of invalid license expressions include:

PackageDeclaredLicense: MIT, LGPLv2, BSD
PackageConcludedLicense: [MIT, LGPLv2, BSD]
LicenseID: MIT/LGPLv2
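The sketch below shows how a consumption tool might reject values like these. It is a simplified checker, not a full implementation of the SPDX grammar: `KNOWN_IDS` and `KNOWN_EXCEPTIONS` are tiny illustrative subsets (a real tool would load the complete SPDX License List), and the function name `check_expression` is an assumption.

```python
import re

KNOWN_IDS = {"MIT", "Apache-2.0", "BSD-3-Clause", "LGPL-2.0-only", "GPL-2.0-only"}
KNOWN_EXCEPTIONS = {"Classpath-exception-2.0"}
SPECIAL_VALUES = {"NONE", "NOASSERTION"}  # the two valid SPDX defaults

LICENSE_REF = re.compile(r"^(DocumentRef-[A-Za-z0-9.\-]+:)?LicenseRef-[A-Za-z0-9.\-]+$")
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z0-9.\-+:]+)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_simple(token):
    # license-id, license-id followed by "+", or a license reference
    base = token[:-1] if token.endswith("+") else token
    return base in KNOWN_IDS or bool(LICENSE_REF.match(token))


def parse_or(tokens, i):
    i = parse_and(tokens, i)
    while i < len(tokens) and tokens[i] == "OR":
        i = parse_and(tokens, i + 1)
    return i


def parse_and(tokens, i):
    i = parse_atom(tokens, i)
    while i < len(tokens) and tokens[i] == "AND":
        i = parse_atom(tokens, i + 1)
    return i


def parse_atom(tokens, i):
    if i >= len(tokens):
        raise ValueError("expression ends unexpectedly")
    if tokens[i] == "(":
        i = parse_or(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise ValueError("missing closing parenthesis")
        return i + 1
    if not is_simple(tokens[i]):
        raise ValueError(f"unknown license identifier {tokens[i]!r}")
    i += 1
    if i < len(tokens) and tokens[i] == "WITH":
        if i + 1 >= len(tokens) or tokens[i + 1] not in KNOWN_EXCEPTIONS:
            raise ValueError("WITH must be followed by a known exception id")
        i += 2
    return i


def check_expression(text):
    """Return None if the value is acceptable, otherwise an error message."""
    if text.strip() in SPECIAL_VALUES:
        return None
    try:
        tokens = tokenize(text)
        end = parse_or(tokens, 0)
        if end != len(tokens):
            raise ValueError(f"unexpected token {tokens[end]!r}")
    except ValueError as err:
        return str(err)
    return None


for value in ["(LGPL-2.0-only AND LicenseRef-3)", "MIT, LGPLv2, BSD",
              "[MIT, LGPLv2, BSD]", "MIT/LGPLv2", "UNKNOWN", "NOASSERTION"]:
    print(f"{value!r}: {check_expression(value) or 'ok'}")
```

Note that the shorthands fail for two separate reasons: separators such as commas, brackets, and slashes are not part of the grammar, and names such as LGPLv2 or BSD are not identifiers on the SPDX License List.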

To overcome these challenges, it is currently crucial to pre-process SBOMs with quality tools (e.g., sbomqs) or data extraction tools (e.g., sbomgr) to understand the accuracy and completeness of the underlying SBOM and the maturity of the SBOM generator.
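As an illustration of what one such pre-processing check could look like, the sketch below finds dangling license references in SPDX tag-value text: references used in a package's license fields that have no matching `LicenseID:` definition in the same document. This is not how sbomqs or sbomgr work internally; the function name `dangling_license_refs` is hypothetical, and references into external documents (`DocumentRef-...:LicenseRef-...`) are deliberately skipped.

```python
import re

# Match local references only; the lookbehind skips DocumentRef-x:LicenseRef-y.
LOCAL_LICENSE_REF = re.compile(r"(?<![A-Za-z0-9.\-:])LicenseRef-[A-Za-z0-9.\-]+")

LICENSE_FIELDS = ("PackageLicenseDeclared", "PackageLicenseConcluded")


def dangling_license_refs(tag_value_text):
    """Return license references that are used but never defined, sorted."""
    defined, used = set(), set()
    for line in tag_value_text.splitlines():
        tag, _, value = line.partition(":")
        tag, value = tag.strip(), value.strip()
        if tag == "LicenseID":
            defined.add(value)
        elif tag in LICENSE_FIELDS:
            used.update(LOCAL_LICENSE_REF.findall(value))
    return sorted(used - defined)


document = """PackageLicenseDeclared: (LGPL-2.0-only AND LicenseRef-3)
PackageLicenseConcluded: (LGPL-2.0-only OR LicenseRef-4)
LicenseID: LicenseRef-3
"""

print(dangling_license_refs(document))
```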

There are currently efforts to improve the quality of licensing information. The Open Source Initiative (OSI) ClearlyDefined project is working to help OSS projects clearly define key information, such as licenses. Tools such as sbom-scorecard can automatically evaluate SBOMs to identify missing information such as license information.

Best Practices for Managing Open Source Licenses:

  1. Implement a license compliance strategy: Establish a clear policy for managing Open Source licenses within your organization. This includes defining acceptable licenses, conducting regular license reviews, and educating developers about license obligations.
  2. Conduct regular license audits and scans: Regularly audit your software projects to ensure compliance with Open Source licenses. Utilize scanning tools that automatically identify and report the licenses associated with your software components.
  3. Utilize tools and automation: Leverage SBOM tools that automatically generate SBOMs and provide insights into license compliance. These tools can automate and streamline the process and ensure accuracy in managing Open Source licenses.
  4. Engage legal expertise: In complex licensing scenarios, consider involving legal expertise to navigate any potential legal challenges. Legal professionals can provide guidance on license compatibility and obligations and help resolve any licensing issues that may arise.
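The first two practices can be partly automated. Below is a minimal sketch of a policy check, assuming each component carries a single SPDX license id extracted from an SBOM. The allowlist `ALLOWED`, the function name `review_components`, and the component names are all hypothetical; a real policy would come from your organization and would need to handle compound expressions.

```python
# Hypothetical organizational policy: licenses approved without further review.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}


def review_components(components, allowed=ALLOWED):
    """Split components into (approved names, items needing review)."""
    approved, needs_review = [], []
    for component in components:
        license_id = component.get("license")
        if license_id in allowed:
            approved.append(component["name"])
        else:
            # Missing licenses are flagged rather than silently accepted.
            needs_review.append((component["name"], license_id or "missing"))
    return approved, needs_review


components = [
    {"name": "lib-a", "license": "MIT"},
    {"name": "lib-b", "license": "GPL-2.0-only"},
    {"name": "lib-c"},
]

approved, needs_review = review_components(components)
print("approved:", approved)
print("needs review:", needs_review)
```

Anything in the second list is a candidate for the legal review described in the last practice.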

While there are known challenges with the placement and quality of open-source licenses in SBOMs, the situation is steadily improving. For most organizations, this makes SBOMs the most straightforward way of tracking and monitoring Open Source licenses.

About the Author

Surendra Pathak is the Founder and CEO of Interlynk, a platform for managing Software Supply Chain security. Surendra has over twenty years of experience leading application, risk, security, and DevOps teams.

This post represents the views of the authors & does not necessarily reflect those of all OpenSSF members.