SCANOSS is a lightweight software composition analysis (SCA) tool designed for fast, accurate detection of open-source components in codebases. The Python library is MIT-licensed, available on GitHub, and supports both gRPC and REST API interactions.
It combines local scanning with the Open Source Software Knowledge Base (OSSKB) to identify dependencies, detect code snippets, and verify license compliance.
What is SCANOSS?
SCANOSS takes a different approach to SCA by focusing on code-level detection rather than just manifest file parsing.
The scanner analyzes actual source files to find matches against millions of open-source projects, catching dependencies that package managers miss.
This includes vendored code, copy-pasted snippets, and components without clear attribution.
The OSSKB contains fingerprints from over 100 million open-source files, enabling accurate identification even for partial code matches.
SCANOSS fingerprints files locally, then sends only those fingerprints, not the source code itself, to the knowledge base for matching.
This architecture balances speed with comprehensive detection.
Key Features
Snippet Detection
Beyond whole-file matching, SCANOSS detects code snippets embedded in your source files.
If a developer copied a function from Stack Overflow or an open-source project, the scanner identifies the source and associated license.
This catches compliance issues that traditional SCA tools overlook.
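To make the idea of snippet matching concrete, here is a minimal sketch of winnowing-style fingerprinting, a common technique for detecting partial code overlap. It is an illustration only, not the SCANOSS implementation: the function names, the k-gram size, the window size, and the use of CRC32 are all assumptions chosen for brevity.

```python
import zlib


def kgram_hashes(text: str, k: int = 5) -> list[int]:
    """Hash every k-character window of the lowercased, whitespace-free text."""
    normalized = "".join(text.lower().split())
    return [
        zlib.crc32(normalized[i:i + k].encode("utf-8"))
        for i in range(len(normalized) - k + 1)
    ]


def winnow(hashes: list[int], window: int = 4) -> set[int]:
    """Keep the minimum hash of each sliding window (the winnowing selection step)."""
    if len(hashes) < window:
        return set(hashes)
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def containment(snippet: str, candidate: str) -> float:
    """Fraction of the snippet's fingerprints that also appear in the candidate."""
    snippet_prints = winnow(kgram_hashes(snippet))
    candidate_prints = winnow(kgram_hashes(candidate))
    if not snippet_prints:
        return 0.0
    return len(snippet_prints & candidate_prints) / len(snippet_prints)


if __name__ == "__main__":
    copied = "def add(a, b):\n    return a + b\n"
    host_file = "import os\n\n" + copied + "\nprint(add(1, 2))\n"
    print(f"containment: {containment(copied, host_file):.2f}")
```

Because only the selected hashes are compared, a copied function can be recognized inside a larger file without comparing the raw text of the two files.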
SBOM Generation
Generate a Software Bill of Materials (SBOM) in standard formats, including SPDX and CycloneDX.
The SBOM includes component names, versions, licenses, and provenance information.
Automated SBOM creation supports compliance with emerging regulations requiring software transparency.
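For readers unfamiliar with what such a document contains, the sketch below builds a minimal CycloneDX-style JSON structure from a list of components. It is not how SCANOSS produces its SBOMs; the `to_cyclonedx` helper and the sample component are hypothetical, and only the top-level CycloneDX fields (`bomFormat`, `specVersion`, `version`, `components`) come from the CycloneDX format itself.

```python
import json


def to_cyclonedx(components: list[dict]) -> dict:
    """Wrap component records in a minimal CycloneDX JSON envelope."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": component["name"],
                "version": component["version"],
                # SPDX license identifier for the component
                "licenses": [{"license": {"id": component["license"]}}],
            }
            for component in components
        ],
    }


if __name__ == "__main__":
    # Hypothetical scan result used purely for illustration
    detected = [{"name": "zlib", "version": "1.3.1", "license": "Zlib"}]
    print(json.dumps(to_cyclonedx(detected), indent=2))
```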
License Compliance
SCANOSS maps detected components to their licenses and highlights conflicts with your project’s licensing policy.
The tool supports custom license policies, flagging copyleft licenses, commercial restrictions, or other terms that require legal review.
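A license policy of this kind can be thought of as a simple classification step. The following sketch assumes a policy made of a deny list and a review list; the `LicensePolicy` class and the component names are hypothetical and are not part of the SCANOSS API.

```python
from dataclasses import dataclass, field


@dataclass
class LicensePolicy:
    """Hypothetical policy: licenses to reject outright and licenses needing review."""
    deny: set[str] = field(default_factory=set)
    require_approval: set[str] = field(default_factory=set)

    def classify(self, license_id: str) -> str:
        """Return 'deny', 'review', or 'allow' for an SPDX license identifier."""
        if license_id in self.deny:
            return "deny"
        if license_id in self.require_approval:
            return "review"
        return "allow"


if __name__ == "__main__":
    policy = LicensePolicy(
        deny={"GPL-3.0", "AGPL-3.0"},
        require_approval={"LGPL-2.1"},
    )
    # Made-up components mapped to their detected licenses
    detected = {"libfoo": "MIT", "libbar": "GPL-3.0", "libbaz": "LGPL-2.1"}
    for component, license_id in detected.items():
        print(f"{component}: {license_id} -> {policy.classify(license_id)}")
```

Exact string matching keeps the sketch short; a real policy would also need to handle identifier variants such as `GPL-3.0-only` and `GPL-3.0-or-later`.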
Cryptographic Detection
Identify cryptographic algorithms and implementations in your codebase.
This helps with export compliance (cryptography export controls) and security audits requiring inventory of cryptographic usage.
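As a rough illustration of what a cryptographic inventory involves, the sketch below scans source text for well-known algorithm names with a regular expression. This is a deliberately naive keyword approach and an assumption on our part; it says nothing about how SCANOSS itself detects cryptography, and the pattern list is far from complete.

```python
import re

# Small, illustrative list of algorithm names; a real inventory needs far more.
CRYPTO_PATTERN = re.compile(
    r"\b(AES|DES|3DES|RSA|SHA-?256|SHA-?1|MD5|ChaCha20)\b",
    re.IGNORECASE,
)


def find_crypto_mentions(source: str) -> dict[str, list[int]]:
    """Map each algorithm name found to the 1-based line numbers where it appears."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CRYPTO_PATTERN.finditer(line):
            hits.setdefault(match.group(1).upper(), []).append(lineno)
    return hits


if __name__ == "__main__":
    sample = (
        "import hashlib\n"
        "digest = hashlib.sha256(data)\n"
        "legacy = hashlib.md5(data)\n"
    )
    for algorithm, lines in find_crypto_mentions(sample).items():
        print(f"{algorithm}: lines {lines}")
```

Keyword matching misses identifiers such as `AES_KEY` (the underscore prevents a word boundary) and cannot see algorithms implemented without naming them, which is why dedicated detection is useful.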
Dependency Graph
Visualize the relationship between your code and open-source components.
The dependency graph shows direct dependencies, transitive dependencies, and how components relate to each other.
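The distinction between direct and transitive dependencies can be sketched as a breadth-first traversal over an adjacency mapping. The graph below is a made-up example and the `transitive_dependencies` function is hypothetical; it is not drawn from SCANOSS output.

```python
from collections import deque


def transitive_dependencies(graph: dict[str, list[str]], root: str) -> set[str]:
    """Return dependencies reachable from root that are not direct dependencies."""
    direct = set(graph.get(root, []))
    seen = set(direct)
    queue = deque(direct)
    while queue:
        node = queue.popleft()
        for dependency in graph.get(node, []):
            # Skip nodes already visited and guard against cycles back to root
            if dependency not in seen and dependency != root:
                seen.add(dependency)
                queue.append(dependency)
    return seen - direct


if __name__ == "__main__":
    # Illustrative graph: each key depends on the packages in its list
    graph = {
        "app": ["requests", "flask"],
        "requests": ["urllib3", "idna"],
        "flask": ["werkzeug", "jinja2"],
        "jinja2": ["markupsafe"],
    }
    print("direct:", sorted(graph["app"]))
    print("transitive:", sorted(transitive_dependencies(graph, "app")))
```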
Installation
Command Line Scanner
Install the Python-based scanner:
pip install scanoss
Or use the standalone binary for faster scanning:
# Linux
curl -LO https://github.com/scanoss/scanner.c/releases/latest/download/scanoss-linux-amd64
chmod +x scanoss-linux-amd64
sudo mv scanoss-linux-amd64 /usr/local/bin/scanoss
# macOS
brew install scanoss/tap/scanoss
Docker
docker pull scanoss/scanoss-py
docker run -v "$(pwd)":/code scanoss/scanoss-py scan /code
How to Use SCANOSS
Basic Scanning
Scan a directory and output results:
scanoss-py scan ./src --output results.json
Generate an SPDX Lite SBOM:
scanoss-py scan ./src --format spdxlite --output sbom.spdx.json
Generate a CycloneDX SBOM:
scanoss-py scan ./src --format cyclonedx --output sbom.cdx.json
Configuration File
Create a scanoss.json configuration:
{
  "scan": {
    "output": "results.json",
    "format": "json",
    "threads": 4
  },
  "ignore": [
    "node_modules",
    "vendor",
    "*.test.js"
  ],
  "policies": {
    "deny_licenses": ["GPL-3.0", "AGPL-3.0"],
    "require_approval": ["LGPL-2.1"]
  }
}
Run with configuration:
scanoss-py scan ./src --settings scanoss.json
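The `ignore` list in the configuration mixes directory names and glob patterns. How the scanner interprets these entries is not specified here, so the sketch below shows one plausible reading: a path is skipped if any of its components matches any pattern. The `is_ignored` helper is hypothetical.

```python
import fnmatch
from pathlib import PurePosixPath


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if any component of the path matches any ignore pattern."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatch.fnmatchcase(part, pattern)
        for part in parts
        for pattern in patterns
    )


if __name__ == "__main__":
    patterns = ["node_modules", "vendor", "*.test.js"]
    for candidate in [
        "node_modules/lodash/index.js",
        "src/app.test.js",
        "src/app.js",
    ]:
        status = "skip" if is_ignored(candidate, patterns) else "scan"
        print(f"{status}: {candidate}")
```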
API Integration
from scanoss.scanner import Scanner

# Initialize scanner
scanner = Scanner()

# Scan a file
results = scanner.scan_file("path/to/source.py")

# Print the details of each match
for match in results:
    print(f"Component: {match['component']}")
    print(f"Version: {match['version']}")
    print(f"License: {match['license']}")
    print(f"Match: {match['matched']}%")
Comparing Codebases
Detect code shared between projects:
scanoss-py compare ./project-a ./project-b --output overlap.json
Integration
GitHub Actions
name: SCANOSS SCA
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run SCANOSS Scan
        uses: scanoss/scanoss-github-action@v1
        with:
          output: results.json
          sbom-format: spdx
      - name: Upload SBOM
        uses: actions/upload-artifact@v4
        with:
          name: sbom
          path: sbom.spdx.json
      - name: Check for policy violations
        run: |
          # -F treats the pattern as a fixed string, so "." is not a wildcard
          if grep -qF "GPL-3.0" results.json; then
            echo "GPL-3.0 component detected - review required"
            exit 1
          fi
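The grep step above matches raw text anywhere in the file, so it cannot say which file or component triggered the failure. A structured check is one alternative. The sketch below assumes the results file maps each scanned path to a list of match objects, each carrying a `component` name and a `licenses` list of objects with a `name` field; that shape is an assumption and should be checked against real scanner output before relying on it. The function names are hypothetical.

```python
import json

DENIED_LICENSES = {"GPL-3.0", "AGPL-3.0"}


def denied_components(results: dict, denied: set[str]) -> list[tuple[str, str, str]]:
    """Collect (path, component, license) for every match under a denied license.

    Assumes results has the shape {path: [{"component": ..., "licenses": [{"name": ...}]}]}.
    """
    violations = []
    for path, matches in results.items():
        for match in matches:
            for license_entry in match.get("licenses", []):
                name = license_entry.get("name")
                if name in denied:
                    violations.append((path, match.get("component", "unknown"), name))
    return violations


def check_file(results_path: str) -> int:
    """Print violations found in a results file and return a CI exit code."""
    with open(results_path, encoding="utf-8") as handle:
        results = json.load(handle)
    violations = denied_components(results, DENIED_LICENSES)
    for path, component, license_name in violations:
        print(f"{path}: {component} is licensed under {license_name}")
    return 1 if violations else 0
```

A CI step could then run a short script that calls `sys.exit(check_file("results.json"))` in place of the grep command.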
GitLab CI
scanoss:
  image: scanoss/scanoss-py:latest
  script:
    - scanoss-py scan . --output results.json
    - scanoss-py scan . --format cyclonedx --output sbom.cdx.json
  artifacts:
    paths:
      - results.json
      - sbom.cdx.json
Jenkins Pipeline
pipeline {
    agent any
    stages {
        stage('SCANOSS Scan') {
            steps {
                sh 'pip install scanoss'
                sh 'scanoss-py scan . --output scanoss-results.json'
            }
        }
        stage('Check Licenses') {
            steps {
                script {
                    // readJSON is provided by the Pipeline Utility Steps plugin
                    def results = readJSON file: 'scanoss-results.json'
                    // Custom license policy check
                }
            }
        }
    }
    post {
        always {
            archiveArtifacts artifacts: 'scanoss-results.json'
        }
    }
}
When to Use SCANOSS
SCANOSS fits well when you need to:
- Detect open-source code beyond declared dependencies
- Find copy-pasted snippets that create license obligations
- Generate accurate SBOMs for regulatory compliance
- Audit acquisitions or third-party code contributions
- Run fast scans in CI/CD without heavy infrastructure
The tool excels at catching undeclared open-source usage that package-manager-based scanners miss.
Development teams who vendor dependencies, copy reference implementations, or accept external code contributions benefit from snippet-level detection.
For projects with well-maintained package manifests, traditional SCA tools may suffice.
SCANOSS adds value when you need deeper visibility into actual code provenance and want to catch compliance issues before they become legal problems.