Pre Commit
Pre-commit is a code quality control tool that allows you to run automatic checks before committing your changes to the version control system (like Git). The main idea of pre-commit is that it runs a series of "hooks" automatically every time you try to make a commit, checking if the code meets the defined standards. If any check fails, the commit is blocked until the problems are fixed.
Main characteristics of pre-commit:
- It's configured through a YAML file, .pre-commit-config.yaml.
- Allows defining multiple hooks for different types of checks.
- Can be used to check code formatting, linting, tests, vulnerabilities, and much more.
- Works with various programming languages.
- Is extensible with custom hooks.
Once configured, pre-commit automatically runs the defined hooks whenever you try to make a commit.
The configuration file has to exist first, so that pre-commit knows what to run.
Installation
We can install pre-commit in several ways. The official documentation shows how to install it with Python's pip, but I prefer to use the operating system's package manager.
## linux
sudo apt-get install pre-commit
## mac
brew install pre-commit
pre-commit --version
pre-commit 4.2.0
To enable pre-commit in the repository we're working on, we execute the install command.
pre-commit install
pre-commit installed at .git/hooks/pre-commit
This command configures Git to automatically run pre-commit hooks whenever you try to make a commit, but how does this happen?
- Creates a Git hook: It creates (or replaces) the .git/hooks/pre-commit file in your local repository. This is an executable script that Git will call automatically before finalizing each commit.
- Configures the verification pipeline: The installed script is configured to read your .pre-commit-config.yaml file and run all checks defined there.
- Establishes the workflow: From this moment on, whenever you run git commit, Git will first run the pre-commit hook, which in turn will run all configured checks.
- Automates quality verification: If any of the checks fail, the commit will be aborted, forcing you to fix the problems before you can commit the code.
This command "connects" the pre-commit tool to the Git commit process, ensuring that your code quality checks are run automatically, without you having to remember to run them manually for each commit.
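To make the abort mechanism concrete, here is a minimal Python sketch of what a hook runner does conceptually: it runs each check as a subprocess and reports a nonzero exit status if any of them failed, which is the signal Git uses to abort the commit. The function name run_checks and the two stand-in commands are hypothetical, and this is not pre-commit's actual implementation.

```python
import subprocess
import sys


def run_checks(commands):
    """Run every command; return 1 if any failed, else 0 (hypothetical helper)."""
    failed = False
    for command in commands:
        # A check signals failure through a nonzero exit status.
        if subprocess.run(command).returncode != 0:
            failed = True
    return 1 if failed else 0


# Two stand-in checks: one that passes and one that fails.
PASSING = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING = [sys.executable, "-c", "raise SystemExit(1)"]

# A Git hook script would exit with this status; nonzero aborts the commit.
status = run_checks([PASSING, FAILING])
print(status)
```

Note that in this sketch every check runs even after one has failed, so all problems are reported in a single pass.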
To uninstall, just run pre-commit uninstall and the hook will be removed from .git/hooks.
Local vs Pipeline
When you run pre-commit install, the installation is only local and doesn't reflect in your remote repository. This happens because:
- The pre-commit install command only modifies your local repository's .git/hooks/ folder.
- The .git/ folder holds the repository's local metadata and is never versioned or pushed, so anything placed in .git/hooks/ stays on your machine.
- Each developer needs to run pre-commit install on their own local copy of the repository.
We can version the .pre-commit-config.yaml file in the repository, but the project README must document that developers need to run pre-commit install after cloning.
You can have the .pre-commit-config.yaml file versioned in the repository and not use it in pipelines, only for local use which allows you to do your own tests.
The big problem with this is that the collaborator MAY or MAY NOT follow the documentation. To avoid this, put pre-commit in your pipelines and show the collaborator that if they don't install to ensure local commit success, it also won't pass in the pipeline.
The advantages of using pre-commit in all processes, local and pipeline, are:
- Consistency: Keeps code consistent throughout the project.
- Automation: Reduces manual verification work.
- Education: Teaches best practices to the team.
- Productivity: Identifies errors before sending them to the remote repository.
Among the items above I see greater importance in the EDUCATION aspect. Forcing use ensures your team works correctly and learns the right working method.
First Hooks
As we saw, the hooks to be executed will be defined inside the .pre-commit-config.yaml file.
We have an initial template to start with our hooks.
pre-commit sample-config
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
# Saving the samples
pre-commit sample-config > .pre-commit-config.yaml
The hooks listed in this configuration (trailing-whitespace, end-of-file-fixer, check-yaml, and check-added-large-files) are all part of the pre-commit default hooks repository https://github.com/pre-commit/pre-commit-hooks, and pre-commit will download and configure them automatically on first run.
Some of the more relevant hooks from the pre-commit-hooks repository:
- check-added-large-files: Prevents large files from being committed.
    - id: check-added-large-files
      args: ['--maxkb=1024'] # (default=500kB)
- check-executables-have-shebangs: Checks that non-binary executables have a proper shebang.
- check-json: Attempts to load all JSON files to verify syntax.
- check-yaml: Attempts to load all YAML files to verify syntax.
- check-merge-conflict: Checks for files that contain merge conflict markers. The --assume-in-merge argument lets the hook run even when there is no merge operation in progress.
- check-symlinks: Checks for symlinks (better known as shortcuts) that don't point to anything.
- detect-aws-credentials: Examines files for patterns that match AWS credentials and compares them with your own credentials configured in the AWS CLI, failing if it finds matches.
- detect-private-key: Detects private keys in the files being committed.
- double-quote-string-fixer: Replaces double-quoted strings with single-quoted strings.
- end-of-file-fixer: Ensures files end with exactly one newline.
- trailing-whitespace: Removes trailing whitespace. To preserve Markdown hard line breaks...
    - id: trailing-whitespace
      args: ['--markdown-linebreak-ext=md']
- no-commit-to-branch: Prevents direct commits to specific branches, usually the main ones of your project (like main or master). This helps to:
    - Force the use of Pull/Merge Requests for all changes to main branches.
    - Prevent accidental changes in production environments.
    - Ensure all changes go through the code review process.
    - Maintain the quality and integrity of protected branches.

    - id: no-commit-to-branch
      args: ['--branch', 'main', '--branch', 'develop', '--branch', 'release']

  We can also protect branches by regular expression with --pattern. Note that a pattern names the branches to block, not the ones to allow. To allow commits only on specific branches (great for gitflow), block every branch that does not start with one of the allowed prefixes:

    - id: no-commit-to-branch
      args: ['--pattern', '^(?!(feature|bugfix|hotfix)/)']
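A note on how --pattern behaves: the patterns describe branches that are protected, so a pattern such as ^(feature|bugfix|hotfix)/ blocks commits on those branches rather than allowing them. The Python sketch below illustrates the matching logic under the assumption that the hook compares each pattern against the current branch name from the start of the string, as re.match does. The function is_protected is a hypothetical stand-in, not the hook's real code.

```python
import re


def is_protected(branch, protected=("main", "master"), patterns=()):
    """Return True if a commit on `branch` should be blocked (sketch)."""
    return branch in protected or any(re.match(p, branch) for p in patterns)


# This pattern BLOCKS the gitflow branches, probably not what we want.
BLOCK_GITFLOW = [r"^(feature|bugfix|hotfix)/"]
# A negative lookahead blocks everything EXCEPT the gitflow prefixes.
ONLY_GITFLOW = [r"^(?!(feature|bugfix|hotfix)/)"]

print(is_protected("feature/login", patterns=BLOCK_GITFLOW))  # True
print(is_protected("feature/login", patterns=ONLY_GITFLOW))   # False
print(is_protected("develop", patterns=ONLY_GITFLOW))         # True
print(is_protected("main"))                                   # True
```

With the negative lookahead, work on feature/, bugfix/, and hotfix/ branches passes while every other branch is rejected.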
And how would our initial config look then? Note that the sample didn't bring the most recent version of the repository and we've already replaced it with the newest one.
## Here we have global items that we'll talk about later, repos is one of them
#...
##
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0 # Use the most recent available version
    hooks:
      - id: check-added-large-files
        args: ['--maxkb=1024'] # Allows files up to 1MB (default is 500kB)
      - id: check-executables-have-shebangs
      - id: check-json
      - id: check-yaml
      - id: check-merge-conflict
        args: ['--assume-in-merge'] # Allows running when there's no merge operation
      - id: check-symlinks
      - id: detect-aws-credentials
        args: ['--credentials-file=~/.aws/credentials', '--allow-missing-credentials']
      - id: detect-private-key
      - id: double-quote-string-fixer
      - id: end-of-file-fixer
      - id: trailing-whitespace
        args: ['--markdown-linebreak-ext=md'] # Preserves hard line breaks in Markdown files
      - id: no-commit-to-branch
        args: ['--branch', 'main', '--branch', 'develop', '--branch', 'release']
And where are these hooks? Do we reference directly what we have in the repository? Who runs these hooks?
The repo entry only points to the project we want to use. pre-commit clones that repository to your machine, into a location we'll look at shortly.
To test, follow the flow normally to make a commit in your repository.
git add -A
git commit -m "test pre-commit"
## And you'll see it running... but it will probably fail.
If it fails, notice that some hooks fixed things and modified files. That's why you need to run git add and git commit again, to include the modifications the hooks made.
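Why does a hook that fixes the problem still fail the commit? A rough Python sketch of a "fixer" hook helps: it rewrites the file in place and returns a nonzero status whenever it changed something, so the first run fails and the second run passes. This is a simplified illustration (LF line endings only), not the code of the real trailing-whitespace hook. The names strip_trailing_whitespace and main are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def strip_trailing_whitespace(text):
    """Remove spaces and tabs at the end of every line (LF endings only)."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def main(filenames):
    """Fix each file in place; return 1 if anything was modified, else 0."""
    exit_code = 0
    for name in filenames:
        path = Path(name)
        original = path.read_text(encoding="utf-8")
        fixed = strip_trailing_whitespace(original)
        if fixed != original:
            path.write_text(fixed, encoding="utf-8")
            print(f"Fixing {name}")
            exit_code = 1  # the file changed, so the commit must be retried
    return exit_code


# Demo on a temporary file with trailing spaces on the first line.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.py"
    demo.write_text("x = 1   \ny = 2\n", encoding="utf-8")
    first_run = main([str(demo)])   # modifies the file, so the hook "fails"
    second_run = main([str(demo)])  # nothing left to fix, so it passes
    result = demo.read_text(encoding="utf-8")

print(first_run, second_run)
```

The second run corresponds to the repeated git add and git commit: the files are already clean, so the hook has nothing to change.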
Environment
Pre-commit is a hooks orchestrator and nothing more. When we point to a repository it already has everything needed to run the hook. We're actually pointing to the repository that it will clone.
When you run a commit, this orchestrator reads your .pre-commit-config.yaml file. On first run, it automatically downloads the specified hooks from their remote repositories and creates isolated environments (usually virtualenvs) for each downloaded hook. When necessary, it installs the hook dependencies in this isolated environment.
By passing this repo: https://github.com/pre-commit/pre-commit-hooks pre-commit will clone this repository into a folder on your system.
# mac and linux
~/.cache/pre-commit
# windows (the same .cache folder, under the user profile)
C:\Users\<your-user>\.cache\pre-commit
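The cache location can also be moved. According to the pre-commit documentation, the default is ~/.cache/pre-commit, and it can be changed with the PRE_COMMIT_HOME or XDG_CACHE_HOME environment variables. The sketch below shows that lookup order in Python. The function name pre_commit_cache_dir is hypothetical and the code is an illustration of the rule, not pre-commit's own source.

```python
import os


def pre_commit_cache_dir(env, home):
    """Resolve the hook cache directory from environment variables (sketch)."""
    # PRE_COMMIT_HOME, when set, is used as the cache directory itself.
    if env.get("PRE_COMMIT_HOME"):
        return env["PRE_COMMIT_HOME"]
    # Otherwise use <XDG_CACHE_HOME>/pre-commit, falling back to ~/.cache.
    cache_root = env.get("XDG_CACHE_HOME") or os.path.join(home, ".cache")
    return os.path.join(cache_root, "pre-commit")


# Where the cache would live for the current user and environment.
print(pre_commit_cache_dir(os.environ, os.path.expanduser("~")))
```

Setting PRE_COMMIT_HOME can be handy in CI pipelines, where the cache directory is often saved between runs.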
Since I already use pre-commit in other projects, I have more repositories there than just pre-commit-hooks.
~/.cache/pre-commit
❯ tree -L 1 .
.
├── README
├── db.db
├── patch1742487237-94180
├── patch1742489491-15045
├── patch1742578081-54469
├── pre-commit.log
├── repo8nab35wd
├── repo_bvmclu5
├── repolgpjw5vx
├── repomto3fe10
├── repon25rfbaz
└── repouta6jw6j
One of these folders is the clone of repo https://github.com/pre-commit/pre-commit-hooks (repouta6jw6j).
We can already see that the project actually uses python to run the scripts of each of the hooks. Other projects may use other methods.
~/.cache/pre-commit/repouta6jw6j
❯ ls
CHANGELOG.md build py_env-python3.12 setup.cfg tests
LICENSE pre_commit_hooks py_env-python3.13 setup.py tox.ini
README.md pre_commit_hooks.egg-info requirements-dev.txt testing
And we can see that we have the pre_commit_hooks folder with the python script that's run when we pass the id.
❯ ls pre_commit_hooks
__init__.py check_toml.py forbid_new_submodules.py
check_added_large_files.py check_vcs_permalinks.py mixed_line_ending.py
check_ast.py check_xml.py no_commit_to_branch.py
check_builtin_literals.py check_yaml.py pretty_format_json.py
check_byte_order_marker.py debug_statement_hook.py removed.py
check_case_conflict.py destroyed_symlinks.py requirements_txt_fixer.py
check_docstring_first.py detect_aws_credentials.py sort_simple_yaml.py
check_executables_have_shebangs.py detect_private_key.py string_fixer.py
check_json.py end_of_file_fixer.py tests_should_end_in_test.py
check_merge_conflict.py file_contents_sorter.py trailing_whitespace_fixer.py
check_shebang_scripts_are_executable.py fix_byte_order_marker.py util.py
check_symlinks.py fix_encoding_pragma.py
What was executed was practically this for each of the hooks:
python /path/to/.cache/pre-commit/repoXYZ/hook_script.py [arguments]
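And which Python ran it? Not one shipped with the repository: pre-commit created a virtualenv (the py_env-python3.13 folder) inside the clone, and the hook runs with that environment's interpreter.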
~/.cache/pre-commit/repouta6jw6j
❯ ls py_env-python3.13/bin/python
py_env-python3.13/bin/python
Local vs Repo
The repo parameter doesn't need to point to a repository URL; it can be set to local. In that case we define each hook ourselves, passing the parameters we want (name, entry, language, and so on).
- repo: local
  hooks:
    - id: custom-script
      name: Custom local script
      entry: ./scripts/validate.sh
      language: script
    - id: trivy
      name: Run trivy
      entry: trivy config . --exit-code 1 --severity HIGH,CRITICAL --skip-dirs "tests" --timeout 5m
      language: system
      pass_filenames: false
      verbose: true
Global Parameters
We didn't pass any global parameters, but it's possible. These global configuration blocks define default values that apply to all hooks in your .pre-commit-config.yaml file, unless overridden in specific hook settings. Below are the default values that we don't need to define but it's worth knowing in case you need them.
### GLOBAL BLOCK ####
# Defines which types of hooks will be installed by default when you run pre-commit install. In this case, only pre-commit type hooks will be installed.
default_install_hook_types:
- "pre-commit"
# Defines default language versions for different hooks. The empty object means system default versions will be used.
default_language_version: {}
# Defines at which git stages each hook will run by default. The default includes every stage, but a hook only actually runs in the stages whose hook types are installed (here, just pre-commit). We'll talk about this next.
default_stages:
- "commit-msg"
- "post-checkout"
- "post-commit"
- "post-merge"
- "post-rewrite"
- "pre-commit"
- "pre-merge-commit"
- "pre-push"
- "pre-rebase"
- "prepare-commit-msg"
# Defines a global pattern of files that will be checked. Empty string means there's no additional global filter.
files: ''
# Defines a regex pattern to exclude files. ^$ is a regex that doesn't match any file (only matches empty strings).
exclude: ^$
# If set to true, pre-commit will stop at the first hook failure. Since it's false, all hooks will run even if some fail.
fail_fast: false
# Minimum pre-commit version needed to run this configuration.
minimum_pre_commit_version: '0'
####################
repos:
#... Continues...
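The files and exclude patterns are easier to reason about with a small experiment. The Python sketch below assumes both patterns are applied to each path with re.search semantics: a file is kept when it matches files and does not match exclude. The function filter_files and the sample paths are made up for the illustration.

```python
import re


def filter_files(filenames, files="", exclude="^$"):
    """Keep names that match `files` and do not match `exclude` (sketch)."""
    include_re, exclude_re = re.compile(files), re.compile(exclude)
    return [
        name for name in filenames
        if include_re.search(name) and not exclude_re.search(name)
    ]


staged = ["main.tf", "modules/vpc/main.tf", "examples/basic/main.tf", "README.md"]

# Defaults: the empty pattern matches everything and ^$ excludes nothing.
everything = filter_files(staged)
# Only .tf files at the top level of the repository.
top_level_tf = filter_files(staged, files=r"^[^/]+\.tf$")
# Everything outside the examples/ folder.
no_examples = filter_files(staged, exclude=r"^examples/")

print(everything)
print(top_level_tf)
print(no_examples)
```

The same patterns can be set globally, as in the block above, or on a single hook.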
Generally pre-commit is used at the pre-commit stage, but we can install hooks for several stages. The name pre-commit is a bit misleading, since the tool isn't limited to that stage.
These are the possible stages:
- pre-commit: Before the commit is created
- commit-msg: To validate the commit message
- post-checkout: After a checkout
- post-commit: After completing a commit
- post-merge: After completing a merge
- post-rewrite: After commands that rewrite commits (like rebase)
- pre-merge-commit: Before a merge commit
- pre-push: Before a push
- pre-rebase: Before a rebase
- prepare-commit-msg: Preparing the default commit message
If we were to install for more stages it would be this example:
# installing hooks for pre-push and commit-msg
pre-commit install --hook-type pre-commit --hook-type pre-push --hook-type commit-msg
If we had this in .pre-commit-config.yaml, a plain pre-commit install would install all three hook types.
default_install_hook_types:
- "pre-commit"
- "pre-push"
- "commit-msg"
Main Hook Parameters
Here are the main parameters available for configuring hooks in pre-commit:
- id: Unique identifier of the hook within the repository (or local).
- name (optional): Custom name for the hook, which will be displayed during execution.
- entry: Entry point for the hook. Can be a script, module, or command.
- language: Language used by the hook. Some possible values: python, node, ruby, rust, dotnet, perl, conda, system (uses system commands).
- files: Regular expression that defines which files should be analyzed by the hook.
- exclude: Regular expression that defines which files should be ignored by the hook.
- types and types_or: Filter files by type. With types a file must have all the listed types; with types_or it needs at least one of them.
- exclude_types: File types to be excluded.
- args: List of arguments to be passed to the hook.
- stages: Defines at which stages the hook should run: pre-commit, pre-merge-commit, pre-push, pre-rebase, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge, post-rewrite, manual (only when called manually).
- additional_dependencies: List of additional dependencies needed for the hook.
- always_run: Defines if the hook should run even when there are no matching files.
- pass_filenames: Defines if file names should be passed to the hook.
- fail_fast: If true, fails on first error occurrence.
- verbose: If true, produces more detailed output.
- require_serial: If true, runs the hook serially (not parallel).
- description: Description of what the hook does.
- minimum_pre_commit_version: Minimum version of pre-commit needed to run the hook.
The values of these parameters vary from hook to hook, because each hook's repository sets its own (in its .pre-commit-hooks.yaml file), so check the hook's documentation before using it. In general, hooks declare only a few keys explicitly and leave the rest at pre-commit's defaults:
## Hooks normally explicitly specify only:
- id:
name:
entry:
language:
files:
types:
################################################
##### Values normally kept default ######
# No specific exclusion pattern
exclude: ^$
types_or: []
exclude_types: []
# Frequently left empty or with minimal description
description: ''
# No additional arguments by default
args: []
# Inherits default_stages when not set; with only the pre-commit hook type installed, that means the pre-commit phase
stages: [pre-commit]
# No additional dependencies
additional_dependencies: []
# Only run when relevant files are modified
always_run: false
# Receive modified file names as arguments
pass_filenames: true
# Most hooks don't stop execution after a failure
fail_fast: false
# Don't display detailed information by default
verbose: false
# Can be run in parallel with other hooks
require_serial: false
#################################################
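The type filters combine in a specific way that a few lines of Python can illustrate. pre-commit tags each file (using the identify library) with labels such as file, text, python, or yaml, and then compares those tags with the hook's types, types_or, and exclude_types. The sketch below assumes the rule "all of types, at least one of types_or if given, none of exclude_types". The function hook_accepts and the hand-written tag sets are stand-ins for the example.

```python
def hook_accepts(tags, types=("file",), types_or=(), exclude_types=()):
    """Decide whether a file with these tags is passed to the hook (sketch)."""
    tags = set(tags)
    return (
        tags >= set(types)                                # every type must match
        and (not types_or or bool(tags & set(types_or)))  # any one is enough
        and not (tags & set(exclude_types))               # none may match
    )


# Hand-written tag sets standing in for what the tagging step would report.
python_file = {"file", "text", "python"}
yaml_file = {"file", "text", "yaml"}

print(hook_accepts(python_file, types=("python",)))           # True
print(hook_accepts(yaml_file, types=("python",)))             # False
print(hook_accepts(yaml_file, types_or=("python", "yaml")))   # True
print(hook_accepts(yaml_file, exclude_types=("yaml",)))       # False
```

This is why a hook declared with types: [python] silently skips a commit that touches only YAML files, unless always_run is enabled.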
Just a detail...
#### THIS IS VALID, BUT IS REDUNDANT.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0 # Use the most recent available version
    hooks:
      - id: check-json
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0 # Use the most recent available version
    hooks:
      - id: check-yaml
It would be the same as writing this.
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0 # Use the most recent available version
    hooks:
      - id: check-json
      - id: check-yaml
To validate the configuration file, run validate-config:
# I forced an error to check.
❯ pre-commit validate-config .pre-commit-config.yaml
==> File .pre-commit-config.yaml
==> At Config()
==> At key: repos
==> At Repository(repo='https://github.com/pre-commit/pre-commit-hooks')
=====> Missing required key: hooks
Evolving Further
Of course the hooks we're going to run depend a lot on the project. If we're developing a terraform module it would be interesting to run terraform fmt, terraform validate, terraform test if there are tests, checkov for static security checking, etc.
Let's talk about this configuration to expand our horizons. This could be a configuration for a terraform project or module.
repos:
  # Already know...
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: check-merge-conflict
      - id: end-of-file-fixer
        exclude: "README.md"
      - id: trailing-whitespace
  # This repo has several hooks for terraform, let's use them
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.96.1
    hooks:
      - id: terraform_fmt
        files: ^[^/]+\.tf$
        exclude: ^examples/
      - id: terraform_tflint
        exclude: ^examples/
        args: # Several arguments passed on to tflint by the hook
          - '--args=--only=terraform_deprecated_interpolation'
          - '--args=--only=terraform_deprecated_index'
          - '--args=--only=terraform_unused_declarations'
          - '--args=--only=terraform_comment_syntax'
          - '--args=--only=terraform_documented_outputs'
          - '--args=--only=terraform_documented_variables'
          - '--args=--only=terraform_typed_variables'
          - '--args=--only=terraform_module_pinned_source'
          - '--args=--only=terraform_naming_convention'
          - '--args=--only=terraform_required_version'
          - '--args=--only=terraform_required_providers'
          - '--args=--only=terraform_standard_module_structure'
          - '--args=--only=terraform_workspace_remote'
      - id: terraform_validate
        exclude: ^examples/
  # However the terraform test command wasn't implemented in this repository and we'll solve this locally
  - repo: local
    hooks:
      - id: terraform-test
        name: Run Terraform Test
        entry: terraform test # The command executed will be terraform test with no arguments
        language: system
        pass_filenames: false
      - id: checkov # And we'll also use checkov for static code analysis
        name: Run checkov
        ## the entry here includes its arguments; they could have been split out into args instead
        entry: checkov -d . --skip-path "examples" --skip-path "tests" --quiet
        language: system
        pass_filenames: false
        verbose: true
  # We'll also generate documentation for this terraform project
  - repo: https://github.com/terraform-docs/terraform-docs
    rev: "v0.19.0"
    hooks:
      - id: terraform-docs-go
        args: ["--output-mode", "replace", "--output-file", "README.md", "."]
The last step was unnecessary as the antonbabenko/pre-commit-terraform repo itself has this hook.
For this configuration to work, terraform and checkov must be available on the system PATH, since language: system hooks run whatever is installed locally and no repository here provides these binaries for us.
Other Hooks
If we navigate to https://pre-commit.com/hooks.html we can find more hooks that can be part of your project.
Several commands we can put in hooks can be run locally.
Popular hooks:
- Code formatters: black, prettier, autopep8
- Linters: flake8, eslint, pylint
- Security checkers: bandit, safety
- Style checkers: isort, yapf
- Syntax checkers: check-json, check-yaml
Useful Commands
- pre-commit run: Runs hooks manually
- pre-commit run check-yaml --all-files: Runs this hook for all files
- pre-commit run --all-files: Runs on all files
- pre-commit autoupdate: Updates hooks to the most recent versions
Pre-commit is a valuable tool for development teams that want to maintain high code quality and standardization in their projects.