GitHub Action
Phylum Analyze PR
Analyze dependencies in a pull request with Phylum
Installation
Copy and paste the following snippet into your .yml file.
- name: Phylum Analyze PR
  uses: phylum-dev/phylum-analyze-pr-action@v2.1.1
Phylum Analyze PR action
A GitHub Action to analyze dependencies with Phylum to protect your code against increasingly sophisticated attacks and get peace of mind to focus on your work.
Overview
Phylum provides a complete risk analysis of "open-source packages" (read: untrusted software from random Internet strangers). Phylum evolved from legacy SCA tools to defend against supply-chain malware, malicious open-source authors, and engineering risk, in addition to software vulnerabilities and license risks. To learn more, please see our website.
Once configured for a repository, this action will provide analysis of project dependencies from lockfiles or manifests during a Pull Request (PR) and output the results as a comment on the PR. The CI job will return an error (i.e., fail the build) if any of the newly added/modified dependencies from the PR fail to meet the established policy.
No comment is posted if no dependencies were added or modified for a given PR. If one or more dependencies are still processing (no results available), the comment will make that clear and the CI job will only fail if dependencies that have completed analysis do not meet the active policy.
Prerequisites
The GitHub Actions environment is primarily supported through the use of a Docker image. The pre-requisites for using this image are:
- Ability to run a Docker container action
  - GitHub-hosted runners must use an Ubuntu runner
  - Self-hosted runners must use a Linux operating system and have Docker installed
- Access to the phylum-dev/phylum-ci Docker image from the GitHub Container Registry
- A GitHub token with API access
  - Can be the default GITHUB_TOKEN provided automatically at the start of each workflow run
    - Needs at least write access for the pull-requests scope - see documentation
  - Can be a personal access token (PAT) - see documentation
    - Needs the repo scope or minimally the public_repo scope if private repositories are not used
- A Phylum token with API access
  - Contact Phylum or register to gain access
    - See also phylum auth register command documentation
  - Consider using a bot or group account for this token
  - Forked repos require the pull_request_target event, to allow secret access
- Access to the Phylum API endpoints
  - That usually means a connection to the internet, optionally via a proxy
  - Support for on-premises installs is not available at this time
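As a hedged illustration of the runner prerequisite, a job that targets a self-hosted Linux runner might be declared as in the sketch below. The `self-hosted` and `linux` labels are the defaults GitHub applies to self-hosted runners. The sketch assumes Docker is already installed on that machine, which the workflow itself cannot guarantee.

```yaml
jobs:
  analyze_deps:
    # Assumption: a self-hosted runner registered with the default
    # `self-hosted` and `linux` labels, with Docker already installed.
    runs-on: [self-hosted, linux]
    permissions:
      contents: read        # For actions/checkout
      pull-requests: write  # For phylum-dev/phylum-analyze-pr-action
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
```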
Supported Dependency Files
If not explicitly specified, an attempt will be made to automatically detect dependency files. These include both lockfiles and manifests. The basic difference is that manifests are where top-level dependencies are specified in their loose form while lockfiles contain the completely resolved collection of the abstract declarations from a manifest.
Some dependency file types (e.g., Python/pip requirements.txt) are ambiguous in that they can be named differently and may or may not contain strict dependencies. That is, they can be either a lockfile or a manifest. We call these "lockifests." Some dependency files fail to parse as the expected lockfile type (e.g., pip instead of poetry for pyproject.toml manifests).
For these situations, the recommendation is to specify the path and lockfile type explicitly in a .phylum_project file at the root of the project repository. The easiest way to do that is with the Phylum CLI, using the phylum init command and committing the generated .phylum_project file.
The Phylum Knowledge Base contains the list of currently supported lockfiles. It is also where information on lockfile generation can be found for current manifest file support.
Getting Started
Phylum analysis of dependencies can be added to existing CI workflows or run on its own with this minimal configuration:
name: Phylum_analyze
on: pull_request
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
This configuration contains a single job, with two steps, that will only run on pull request events. It does not override any of the phylum-ci arguments, which are all either optional or default to secure values.
Let's take a deeper dive into each part of the configuration:
Workflow and Job names
The workflow and job can be named differently or folded into existing workflows/jobs.
name: Phylum_analyze            # Name the workflow what you like
on: pull_request
jobs:
  analyze_deps:                 # Name the job what you like
    name: Analyze dependencies with Phylum  # This name is optional (defaults to job name)
Workflow trigger
The Phylum Analyze PR action expects to be run in the context of a pull_request webhook event. This includes both pull_request and pull_request_target events.
# NOTE: These are examples. Only one definition for `on` is expected.
# Specify the `pull_request` event trigger on one line
on: pull_request
# Alternative to specify `pull_request` trigger (e.g., when other triggers are present)
on:
  pull_request:
# Specify specific branches for the `pull_request` trigger to target
on:
  pull_request:
    branches:
      - main
      - develop
Allowing pull requests from forked repositories requires using the pull_request_target event since the Phylum API key is stored as a secret and the pull_request event does not provide access to secrets when the PR comes from a fork.
on:
  pull_request:
  # Allow PRs from forked repos to access secrets, like the Phylum API key
  pull_request_target:
⚠️ WARNING ⚠️ Using the pull_request_target event for forked repositories requires additional configuration when checking out the repo. Be aware that such a configuration has security implications if done improperly. Attackers may be able to obtain repository write permissions or steal repository secrets. Please take the time to understand and mitigate the risks:
- GitHub Security Lab: "Preventing pwn requests"
- GitGuardian: "GitHub Actions Security Best Practices"
Minimal suggestions include:
- Use a separate workflow for the Phylum Analyze PR action
- Do not provide access to any secrets beyond the Phylum API key
- Limit the steps in the job to two: checking out the PR's code and using the Phylum action
Permissions
When using the default GITHUB_TOKEN provided automatically at the start of each workflow run, it is good practice to ensure the actions used in the workflow are given the least privileges needed to perform their intended function. The Phylum Analyze PR action needs at least write access for the pull-requests scope. The actions/checkout action needs at least read access for the contents scope. See the GitHub documentation for more info.
    permissions:              # Ensure least privilege of actions
      contents: read          # For actions/checkout
      pull-requests: write    # For phylum-dev/phylum-analyze-pr-action
When using a personal access token (PAT) instead, the token should be created with the repo scope or minimally with the public_repo scope if private repositories will not be used with the PAT. See the GitHub documentation for more info.
    permissions:              # Ensure least privilege of actions
      contents: read          # For actions/checkout
      # The phylum-dev/phylum-analyze-pr-action does not
      # need the `pull-requests` scope here if using a PAT
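To show how the reduced permissions and the PAT fit together, here is a minimal sketch of a job that passes a PAT through the `github_token` input. The secret name `GH_PAT` is hypothetical. GitHub does not allow secret names that begin with `GITHUB_`, so a different prefix is assumed here.

```yaml
jobs:
  analyze_deps:
    runs-on: ubuntu-latest
    permissions:
      contents: read  # For actions/checkout
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
          # Assumption: the PAT is stored as a secret named `GH_PAT`.
          github_token: ${{ secrets.GH_PAT }}
```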
Specifying a Runner
The Phylum Analyze PR action is a Docker container action. This requires that GitHub-hosted runners use an Ubuntu runner. Self-hosted runners must use a Linux operating system and have Docker installed.
runs-on: ubuntu-latest
Checking out the Repository
git is used within the phylum-ci package to do things like determine if there was a dependency file change and, when specified, report on new dependencies only. Therefore, a clone of the repository is required to ensure that the local working copy is always pristine and history is available to pull the requested information.
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          # Specifying a depth of 0 ensures all history for all branches.
          # This input may not be required when `--all-deps` option is used.
          fetch-depth: 0
Allowing pull requests from forked repositories requires using the pull_request_target event and checking out the head of the forked repository:
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          # Specifying the head of the forked repository's PR branch
          # is required to get any proposed dependency file changes.
          ref: ${{ github.event.pull_request.head.sha }}
⚠️ WARNING ⚠️ Using the pull_request_target event for forked repositories and checking out the pull request's code has security implications if done improperly. Attackers may be able to obtain repository write permissions or steal repository secrets. Please take the time to understand and mitigate the risks:
- GitHub Security Lab: "Preventing pwn requests"
- GitGuardian: "GitHub Actions Security Best Practices"
Minimal suggestions include:
- Use a separate workflow for the Phylum Analyze PR action
- Do not provide access to any secrets beyond the Phylum API key
- Limit the steps in the job to two: checking out the PR's code and using the Phylum action
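Taken together, the minimal suggestions could look like the following sketch of a dedicated workflow. It is an illustration assembled from the snippets in this document, not a vetted security configuration. The file name and workflow name are hypothetical. Note that pull_request_target also fires for PRs that do not come from forks, so running this alongside a pull_request workflow may analyze the same PR twice.

```yaml
# Hypothetical separate workflow file, e.g. .github/workflows/phylum_forks.yml
name: Phylum_analyze_forks
on: pull_request_target
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      # Step 1 of 2: check out the PR's code (head of the fork's branch)
      - name: Checkout the PR's code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          ref: ${{ github.event.pull_request.head.sha }}
      # Step 2 of 2: the only secret exposed is the Phylum API key
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
```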
Action Inputs
The action inputs are used to ensure the phylum-ci tool is able to perform its job.
A Phylum token with API access is required to perform analysis on project dependencies. Contact Phylum or register to gain access. See also phylum auth register command documentation and consider using a bot or group account for this token.
A GitHub token with API access is required to use the API (e.g., to post comments). This can be the default GITHUB_TOKEN provided automatically at the start of each workflow run but it will need at least write access for the pull-requests scope (see documentation). Alternatively, it can be a personal access token (PAT) with the repo scope or minimally the public_repo scope, if private repositories are not used.
The values for the phylum_token and github_token action inputs can come from repository, environment, or organizational encrypted secrets. Since they are sensitive, care should be taken to protect them appropriately.
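For the environment-secret case, a job can name the environment it draws secrets from with the standard `environment` key. The sketch below assumes an environment called `phylum` has been created in the repository settings and holds the `PHYLUM_TOKEN` secret. Both the environment name and that layout are illustrative.

```yaml
jobs:
  analyze_deps:
    runs-on: ubuntu-latest
    # Assumption: an environment named `phylum` exists in the repository
    # settings and stores the PHYLUM_TOKEN secret.
    environment: phylum
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          # Resolved from the `phylum` environment's secrets
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
```

Environment protection rules (such as required reviewers) then apply before the job can read the token, which may be a useful extra guard for a sensitive value.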
The cmd arguments to the Docker image are the way to exert control over the execution of the Phylum analysis. The phylum-ci script entry point is expected to be called. It has a number of arguments that are all optional and defaulted to secure values. To view the arguments, their description, and default values, run the script with --help output as specified in the Usage section of the phylum-dev/phylum-ci repository's README or more simply view the script options output for the latest release.
    steps:
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          # Contact Phylum (phylum.io/contact-us) or register (app.phylum.io/register) to gain access.
          # See also `phylum auth register` (docs.phylum.io/cli/commands/phylum_auth_register) docs.
          # Consider using a bot or group account for this token.
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
          # NOTE: These are examples. Only one `github_token` entry line is expected.
          #
          # Use the default `GITHUB_TOKEN` provided automatically at the start of each workflow run.
          # This entry does not have to be specified since it is the default.
          github_token: ${{ secrets.GITHUB_TOKEN }}
          # Use a personal access token (PAT). User-defined secret names cannot
          # start with `GITHUB_`, so an example name of `GH_PAT` is used here.
          github_token: ${{ secrets.GH_PAT }}
          # NOTE: These are examples. Only one `cmd` entry line is expected.
          #
          # Use the defaults for all the arguments.
          # The default behavior is to only analyze newly added dependencies against
          # the active policy set at the Phylum project level.
          # This entry does not have to be specified since it is the default.
          cmd: phylum-ci
          # Provide debug level output
          cmd: phylum-ci -vv
          # Consider all dependencies in analysis results instead of just the newly added ones.
          # The default is to only analyze newly added dependencies, which can be useful for
          # existing code bases that may not meet established policy rules yet,
          # but don't want to make things worse. Specifying `--all-deps` can be useful for
          # casting the widest net for strict adherence to Quality Assurance (QA) standards.
          cmd: phylum-ci --all-deps
          # Force analysis, even when no dependency file has changed. This can be useful for
          # manifests, where the loosely specified dependencies may not change often but the
          # completely resolved set of strict dependencies does.
          cmd: phylum-ci --force-analysis
          # Force analysis for all dependencies in a manifest file. This is especially useful
          # for *workspace* manifest files where there is no companion lockfile (e.g., libraries).
          cmd: phylum-ci --force-analysis --all-deps --depfile Cargo.toml
          # Some lockfile types (e.g., Python/pip `requirements.txt`) are ambiguous in that
          # they can be named differently and may or may not contain strict dependencies.
          # In these cases it is best to specify an explicit path, either with the `--depfile`
          # option or in a `.phylum_project` file. The easiest way to do that is with the
          # Phylum CLI, using the `phylum init` (https://docs.phylum.io/cli/commands/phylum_init)
          # command and committing the generated `.phylum_project` file.
          cmd: phylum-ci --depfile requirements-prod.txt
          # Specify multiple explicit dependency file paths
          cmd: phylum-ci --depfile requirements-prod.txt path/to/dependency.file
          # Install a specific version of the Phylum CLI.
          cmd: phylum-ci --phylum-release 4.8.0 --force-install
          # Mix and match for your specific use case.
          cmd: |
            phylum-ci \
              -vv \
              --depfile requirements-dev.txt \
              --depfile requirements-prod.txt path/to/dependency.file \
              --depfile Cargo.toml \
              --force-analysis \
              --all-deps
Example Comments
Phylum OSS Supply Chain Risk Analysis - FAILED
Phylum OSS Supply Chain Risk Analysis - INCOMPLETE WITH FAILURE
Phylum OSS Supply Chain Risk Analysis - INCOMPLETE
Phylum OSS Supply Chain Risk Analysis - SUCCESS
Alternatives
The default phylum-ci Docker image contains git and the installed phylum Python package. It also contains an installed version of the Phylum CLI and all required tools needed for lockfile generation. An advantage of using the default Docker image is that the complete environment is packaged and made available with components that are known to work together.
One disadvantage to the default image is its size. It can take a while to download and may provide more tools than required for your specific use case. Special slim tags of the phylum-ci image are provided as an alternative. These tags differ from the default image in that they do not contain the required tools needed for lockfile generation (with the exception of the pip tool). The slim tags are significantly smaller and allow for faster action run times. They are useful for those instances where no manifest files are present and/or only lockfiles are used.
Using the slim image tags is possible by altering your workflow to use the image directly instead of this GitHub Action. That is possible with either container jobs or container steps.
Container Jobs
GitHub Actions allows for workflows to run a job within a container, using the container: statement in the workflow file. These are known as container jobs. More information can be found in GitHub documentation: "Running jobs in a container". To use a slim tag in a container job, use this minimal configuration:
name: Phylum_analyze
on: pull_request
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    container:
      image: docker://ghcr.io/phylum-dev/phylum-ci:slim
      env:
        GITHUB_TOKEN: ${{ github.token }}
        PHYLUM_API_KEY: ${{ secrets.PHYLUM_TOKEN }}
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        run: phylum-ci -vv
The image: value is set to the latest slim image, but other tags are available to ensure a specific release of the phylum-ci project and a specific version of the Phylum CLI. The full list of available phylum-ci image tags can be viewed on GitHub Container Registry (preferred) or Docker Hub.
The GITHUB_TOKEN and PHYLUM_API_KEY environment variables are required to have those exact names. The rest of the options are the same as already documented.
Container Steps
GitHub Actions allows for workflows to run a step within a container, by specifying that container image in the uses: statement of the workflow step. These are known as container steps. More information can be found in GitHub workflow syntax documentation. To use a slim tag in a container step, use this minimal configuration:
name: Phylum_analyze
on: pull_request
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        uses: docker://ghcr.io/phylum-dev/phylum-ci:slim
        env:
          GITHUB_TOKEN: ${{ github.token }}
          PHYLUM_API_KEY: ${{ secrets.PHYLUM_TOKEN }}
        with:
          args: phylum-ci -vv
The uses: value is set to the latest slim image, but other tags are available to ensure a specific release of the phylum-ci project and a specific version of the Phylum CLI. The full list of available phylum-ci image tags can be viewed on GitHub Container Registry (preferred) or Docker Hub.
The GITHUB_TOKEN and PHYLUM_API_KEY environment variables are required to have those exact names. The rest of the options are the same as already documented.
FAQs
Why does Phylum report a failing status check if it shows a successful analysis comment?
It is possible to get a successful Phylum analysis comment on the PR and also have the Phylum action report a failing status check. This happens when one or more dependency files fail the filtering process while at least one dependency file passes both the filtering process and the Phylum analysis.
The failing status check is meant to serve as an indication to the repository owner that an issue exists with at least one of the dependency files submitted, whether they intended it or not. The reasoning is that it is better to be explicit about possible failures, allowing for review of the logs and correction, than to silently ignore the failure and possibly allow untrusted code into the repository.
There are several reasons a dependency file may fail the filtering process and each failure will be included in the logs as a warning. The file may not exist or it may exist, but only as an empty file. The file may fail to be parsed by Phylum. The dependency files can be manifests or lockfiles and they can either be provided explicitly or automatically detected when not provided. Sometimes the automatic detection will misattribute a file as a manifest or assign the wrong lockfile type. As detailed in the "Supported Dependency Files" section, the recommendation for this situation is to specify the path and lockfile type explicitly in a .phylum_project file at the root of the project repository.
Why does analysis fail for PRs from forked repositories?
Another reason why Phylum reports failing status checks is for pull_request_target events where manifests are provided. Using pull_request_target events for forked repositories has security implications if done improperly. Attackers may be able to obtain repository write permissions or steal repository secrets. A more comprehensive enumeration of the risks can be found here:
- GitHub Security Lab: "Preventing pwn requests"
- GitGuardian: "GitHub Actions Security Best Practices"
This GitHub action disables lockfile generation to prevent arbitrary code execution in an untrusted context, like PRs from forks. This means that provided manifests cannot be parsed by Phylum, since parsing first requires generating a lockfile from the manifest. A unique error code and warning message are provided to better signal the implication: the resolved dependencies from the manifest have NOT been analyzed by Phylum. Care should be taken to inspect changes manually before allowing a manifest to be used in a trusted context.
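One way to limit this failure mode, sketched here under assumptions, is to point the action only at a committed lockfile so that no lockfile generation is needed. The file name `package-lock.json` is purely illustrative. Any manifest left out this way is not analyzed at all, so this is a trade-off rather than a fix.

```yaml
    steps:
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
          # Assumption: the repository commits a lockfile at this path.
          # Manifests omitted here are NOT analyzed by Phylum.
          cmd: phylum-ci --depfile package-lock.json
```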
License
Copyright (C) 2022 Phylum, Inc.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program.
If not, see https://www.gnu.org/licenses/gpl.html or write to phylum@phylum.io or engineering@phylum.io
Contributing
Suggestions and help are welcome. Feel free to open an issue or otherwise contribute. More information is available on the contributing documentation page.
Code of Conduct
Everyone participating in the phylum-analyze-pr-action project, and in particular in the issue tracker and pull requests, is expected to treat other people with respect and more generally to follow the guidelines articulated in the Code of Conduct.
Security Disclosures
Found a security issue in this repository? See the security policy for details on coordinated disclosure.