How to use the cache on Bitbucket Pipelines

Original Japanese article is here:


  • The only thing the cache does is extract the stored directories at the beginning of a pipeline.
  • So, to make use of the cache, you must write a script that uses the cached files if they exist and otherwise downloads them.

What I wanted to do

Bitbucket Pipelines has a "cache" feature.
What I wanted to do was use the cache to speed up building a library that requires make && make install. However, the official documentation, in particular the part about custom caches, was hard for me to follow, and I struggled with it.

Use Docker image if possible

You can use any Docker image as the build base image in Bitbucket Pipelines.
Therefore, use Docker's build caching instead if you can either:

  • make the development environment's Docker image public (e.g. on Docker Hub)
  • pull images from a private registry (e.g. your own Docker registry)

The rest of this article is for people who are not permitted to publish Docker images, or who want to cache a workflow that is not a good fit for Docker.

How to use Pipelines cache

In this article, I tried to cache the build of MeCab+IPAdic. 1

My final repository looks like this:


I will describe each file one by one.


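# bitbucket-pipelines.yml
image: python:3.6.4

pipelines:
  default:
    - step:
        caches:
          - pip
          - mecab
          - mecab-ipadic
        script:
          - bash  # MeCab script: checks the cache, builds on a miss
          - (cd ~/mecab/mecab/mecab-0.996 && make install && ldconfig)
          - bash  # IPAdic script: checks the cache, builds on a miss
          - (cd ~/mecab/mecab-ipadic/mecab-ipadic-2.7.0-20070801 && make install)
          - pip install -r requirements.txt
          - python test  # your test here

definitions:
  caches:
    mecab: ~/mecab/mecab
    mecab-ipadic: ~/mecab/mecab-ipadic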

The first bash line in the script section checks for a cache of the already-built MeCab. If the cache does not exist, it downloads MeCab and runs make.
The line that follows it runs make install for MeCab.

The second bash line does almost the same thing for IPAdic.

After these make install steps, I run the test (python test), which is the actual target of this pipeline.

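# Must match the "mecab" cache path in bitbucket-pipelines.yml
CACHE_DIR=~/mecab/mecab
mkdir -p "${CACHE_DIR}"
if [ -d "${CACHE_DIR}/mecab-0.996" ]; then
  # Cache hit: the built tree was restored, nothing to do here
  echo "found mecab cache"
else
  # Cache miss: download, extract, and build inside the cache directory
  wget "" -O mecab-0.996.tar.gz
  tar zxfv mecab-0.996.tar.gz -C "${CACHE_DIR}/"
  cd "${CACHE_DIR}/mecab-0.996"
  ./configure --with-charset=utf8
  make
fi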

As described above, this script checks for a cache of the already-built MeCab (the first if statement). If the cache does not exist, it downloads MeCab and runs make.

The results of ./configure and make are also cached, because they do not change until I change the base image.
(The remaining make install is a light operation because it almost always just copies some files.) 2

At the beginning of a pipeline, the cache is extracted from the directories that were stored when an earlier pipeline finished successfully.
In this case, the step: caches: and definitions: caches: sections in bitbucket-pipelines.yml specify the cache directories.
(There is no definition for pip because Pipelines pre-defines the caches for commonly used package managers.)
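
Because the cache is only extracted at the start of the step, it can help to log what was actually restored before the build scripts run. The following is a minimal sketch, not part of my repository: the helper name report_cache is hypothetical, and treating an empty directory as a miss is my own assumption about what makes a useful check.

```shell
# report_cache DIR...
# Hypothetical helper: prints "hit" for each directory that exists and is
# non-empty, and "miss" otherwise.
report_cache() {
  for _d in "$@"; do
    if [ -d "${_d}" ] && [ -n "$(ls -A "${_d}" 2>/dev/null)" ]; then
      echo "hit  ${_d}"
    else
      echo "miss ${_d}"
    fi
  done
}
```

It could be called as the first script line, for example report_cache ~/mecab/mecab ~/mecab/mecab-ipadic.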

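# Must match the "mecab-ipadic" cache path in bitbucket-pipelines.yml
CACHE_DIR=~/mecab/mecab-ipadic
mkdir -p "${CACHE_DIR}"
if [ -d "${CACHE_DIR}/mecab-ipadic-2.7.0-20070801" ]; then
  # Cache hit: the built tree was restored, nothing to do here
  echo "found ipadic cache"
else
  # Cache miss: download, extract, and build inside the cache directory
  wget "" -O mecab-ipadic-2.7.0-20070801.tar.gz
  tar zxfv mecab-ipadic-2.7.0-20070801.tar.gz -C "${CACHE_DIR}/"
  cd "${CACHE_DIR}/mecab-ipadic-2.7.0-20070801"
  ./configure --with-charset=utf8
  make
fi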

This script does the same thing for IPAdic.
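
The MeCab and IPAdic scripts differ only in the archive they handle, so the "use the cache if it exists, otherwise build" pattern could be factored into one helper. This is only a sketch under my own assumptions, not the scripts from my repository: the function name ensure_built and its arguments are hypothetical, and the build command is supplied by the caller.

```shell
# ensure_built DIR CMD [ARGS...]
# Hypothetical helper: runs CMD only when DIR is missing (a cache miss).
# CMD is responsible for creating DIR, e.g. by extracting a tarball there.
ensure_built() {
  _dir="$1"
  shift
  if [ -d "${_dir}" ]; then
    echo "found cache: ${_dir}"
  else
    echo "cache miss: ${_dir}"
    # Make sure the parent (the cached directory) exists before building
    mkdir -p "$(dirname "${_dir}")"
    "$@"
  fi
}
```

A caller would wrap the download, ./configure and make steps in a function of its own and pass it in, for example ensure_built "${CACHE_DIR}/mecab-0.996" build_mecab, where build_mecab is again a hypothetical name. Note that the check only looks for the directory, so a build that fails halfway leaves a directory that would be treated as a hit if it were ever cached.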

Speed up


In my case, the final attempt reduced the build time to 1/3.
(pip install took 26 sec, which is about half of the build time (51 sec). Caching the download does not help much when the installation itself is the heavy part.
As described previously, creating a Docker image, if possible, is the better solution in my opinion.)
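
To see which commands dominate the build time, as with the pip install figure above, each script line can be wrapped in a small timer. This is a rough sketch rather than what I actually ran: the helper name timed is hypothetical, and because it relies only on date +%s, its resolution is one second.

```shell
# timed LABEL CMD [ARGS...]
# Hypothetical helper: runs CMD, prints the elapsed whole seconds,
# and returns CMD's exit status so a failing command still fails the step.
timed() {
  _label="$1"
  shift
  _start="$(date +%s)"
  "$@"
  _status=$?
  _end="$(date +%s)"
  echo "${_label}: $((_end - _start)) sec"
  return "${_status}"
}
```

For example, timed "pip install" pip install -r requirements.txt would print one timing line after pip's own output.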


  1. MeCab is a Japanese language tokenizer. IPAdic is its main dictionary. 

  2. Because the whole cache directory is restored and the result of the last make is still there, running make every time may not affect the build time, since make skips targets that are already up to date.