Build a Mac (Mojave - macOS 10.14.5)

With each new release of macOS we try to keep an updated record of how our machines are built so that others in the lab and the community can follow along. Some of these steps are overkill internally because of our high performance computing resources, but it is still nice to test and play locally every once in a while.

Build macOS 10.14.5 MacBook Pro 2018 (Should work with or without touchbar)

1) Setup Basic Environment

Make a Local Directory and Binary Directory

# Open a new terminal window, this ensures the current directory is your $HOME directory

mkdir -p local/bin

Create a Profile and set $HOME/local/bin as a path directory

# Open profile file (a hidden file that tells terminal what to do)

vim .profile

# add the following lines to profile (press "I" to enter insert mode) [see vim page on vim usage]

# This ensures when a vim window closes the edited text disappears
export TERM=xterm
############### 
## Define PATH locations 
###############
export PATH=$HOME/local/bin:$PATH
###############
## Configure Window Naming Function
###############
#Window Naming Function
# The name is passed through %s so a "%" in it is not read as a format directive
function winname {
    printf '\e]2;%s\a' "$1"
    }
#Tab Naming Function
function tabname {
    printf '\e]1;%s\a' "$1"
    }
###############
## Alias
###############
# This alias allows you to type "ll" in the terminal to execute "ls -lh"
alias ll='ls -lh'
# Remote connections
# These aliases are examples that open connections to a computational resource (ssh and sftp connections shown)
alias computer='ssh jkeats@computer.tgen.org'
alias cftp='sftp jkeats@computer.tgen.org' 

# save updated profile (press "esc" then ":w" then ":q")
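
Once the profile is saved and a new terminal window is open, it can be worth confirming that $HOME/local/bin really ended up on the PATH. The following is a minimal sketch; the helper name path_contains is hypothetical and is not part of the profile above.

```shell
# Hypothetical helper (not part of the profile above): succeeds when
# directory $1 is an entry of the colon-separated list $2.
# When $2 is omitted, the current $PATH is checked.
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_contains "$HOME/local/bin"; then
    echo "local/bin is on the PATH"
else
    echo "local/bin is missing from the PATH, re-check .profile"
fi
```

Wrapping both sides in colons means only whole PATH entries match, so /usr does not count as present just because /usr/bin is.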

2) Install Xcode

This is software provided by Apple that is used for building applications for Mac computers or iOS devices. It is an excellent code editor, but more importantly it contains the application "make" that you will need to install most NGS applications.

a) Open the App Store

b) Search for Xcode

c) Follow the prompts to install (WARNING - it is big and will take a while, even on a fast connection)

3) Install R

R is free software that supports a multitude of statistical and graphic applications. To download, search google for "R download" and follow the links to a CRAN download page of your choosing.

Download by clicking the "Download R for (Mac) OS X" link to the pre-compiled binary. (current version is R-3.6.0, Planting of a Tree, 2019-04-26).

To install open the install package (R-3.6.0.pkg) and follow the prompts.

Once installed open the application (Applications>R). Then install the following packages.

# Tidyverse (this adds most of the common tools/applications)
> install.packages("tidyverse")
  also installing the dependencies ‘colorspace’, ‘sys’, ‘ps’, ‘highr’, ‘markdown’, ‘xfun’, ‘zeallot’, ‘labeling’, ‘munsell’, ‘RColorBrewer’, ‘askpass’, ‘rematch’, ‘prettyunits’, ‘processx’, ‘knitr’, ‘yaml’, ‘htmltools’, ‘evaluate’, ‘base64enc’, ‘tinytex’, ‘utf8’, ‘vctrs’, ‘backports’, ‘generics’, ‘reshape2’, ‘assertthat’, ‘glue’, ‘pkgconfig’, ‘R6’, ‘Rcpp’, ‘tidyselect’, ‘BH’, ‘plogr’, ‘DBI’, ‘ellipsis’, ‘digest’, ‘gtable’, ‘lazyeval’, ‘plyr’, ‘scales’, ‘viridisLite’, ‘withr’, ‘curl’, ‘mime’, ‘openssl’, ‘clipr’, ‘cellranger’, ‘progress’, ‘callr’, ‘fs’, ‘rmarkdown’, ‘whisker’, ‘selectr’, ‘stringi’, ‘fansi’, ‘pillar’, ‘broom’, ‘cli’, ‘crayon’, ‘dplyr’, ‘dbplyr’, ‘forcats’, ‘ggplot2’, ‘haven’, ‘hms’, ‘httr’, ‘jsonlite’, ‘lubridate’, ‘magrittr’, ‘modelr’, ‘purrr’, ‘readr’, ‘readxl’, ‘reprex’, ‘rlang’, ‘rstudioapi’, ‘rvest’, ‘stringr’, ‘tibble’, ‘tidyr’, ‘xml2’
# Cowplot (formatting for ggplot outputs and panels, yes it's a love/hate thing)
> install.packages("cowplot")
# gridExtra
> install.packages("gridExtra")
# gdata
> install.packages("gdata")
  also installing the dependency ‘gtools’
# data.table
> install.packages("data.table")
# maptools
> install.packages("maptools")
  also installing the dependency ‘sp’
# maps
> install.packages("maps")
# PBSmapping
> install.packages("PBSmapping")
# psych (provides a geometric mean method)
> install.packages("psych")
  also installing the dependency ‘mnormt’
# Hmisc
> install.packages("Hmisc")
  also installing the dependencies ‘checkmate’, ‘htmlwidgets’, ‘Formula’, ‘latticeExtra’, ‘acepack’, ‘htmlTable’, ‘viridis’

# TOOLS FOR MAKING VENN DIAGRAMS
> install.packages("VennDiagram")
  also installing the dependencies ‘formatR’, ‘lambda.r’, ‘futile.options’, ‘futile.logger’
> install.packages("bvenn")
> install.packages("colorfulVennPlot")
> install.packages("venn")
> install.packages("venneuler")
  also installing the dependency ‘rJava’
> install.packages("UpSetR")

# TOOLS FOR SURVIVAL ANALYSIS
> install.packages("survival")
> install.packages("survminer")
  also installing the dependencies ‘ggrepel’, ‘ggsci’, ‘ggsignif’, ‘polynom’, ‘exactRankTests’, ‘mvtnorm’, ‘KMsurv’, ‘zoo’, ‘km.ci’, ‘xtable’, ‘ggpubr’, ‘maxstat’, ‘survMisc’, ‘cmprsk’

# Tools for Single Cell Analysis
> install.packages('Seurat')
  also installing the dependencies ‘httpuv’, ‘sourcetools’, ‘bitops’, ‘lsei’, ‘bibtex’, ‘gbRd’, ‘shiny’, ‘later’, ‘caTools’, ‘R.oo’, ‘R.methodsS3’, ‘npsurv’, ‘globals’, ‘listenv’, ‘Rdpack’, ‘hexbin’, ‘crosstalk’, ‘promises’, ‘gplots’, ‘R.utils’, ‘ape’, ‘fitdistrplus’, ‘future’, ‘future.apply’, ‘ggridges’, ‘ica’, ‘igraph’, ‘irlba’, ‘lmtest’, ‘metap’, ‘pbapply’, ‘plotly’, ‘png’, ‘RANN’, ‘reticulate’, ‘ROCR’, ‘rsvd’, ‘Rtsne’, ‘sctransform’, ‘SDMTools’, ‘tsne’, ‘RcppEigen’, ‘RcppProgress’

# INSTALL BIOCONDUCTOR TOOLS
> install.packages("BiocManager")
> BiocManager::install("DESeq")
> BiocManager::install("DESeq2")
> BiocManager::install("GenomeInfoDb")
> BiocManager::install("DNAcopy")

4) Install RStudio

RStudio is a nice graphical interface for using R. It makes R much less scary to use for making graphs because you get instantaneous feedback. To download, search google for "RStudio" and follow the links to download the Open Source version of RStudio Desktop.

To install, click on the download (RStudio-1.2.1335.dmg) and drag the RStudio application to your applications folder.

5) Solarize the Terminal

I like colors, and you can configure the terminal to use them to differentiate file types and folders. When using vim as a text editor you can also color different elements. There are many ways of doing this, but the best and easiest one I've found is a color scheme called "solarized" by Ethan Schoonover. There are two versions, light and dark; pick whichever you prefer.

The following steps will set this color scheme as the default on your Mac:

STEP A - Download the solarized code and configure the vimrc and shell profile

### Open a Fresh Terminal Window
# Create a folder for GitHub Repositories
mkdir -p local/git_repositories
# Move into the created folder
cd local/git_repositories
#Download the solarized package
git clone https://github.com/altercation/solarized.git
#Enter the downloaded package, make the required directory and copy the color scheme
cd solarized/vim-colors-solarized/colors
mkdir -p ~/.vim/colors
cp solarized.vim ~/.vim/colors/
## This will ensure the VIM text editor will use the solarized format
vim ~/.vimrc
#Add the following lines (see VIM section on how to use vim)
syntax enable
set background=dark
colorscheme solarized
## This will ensure the Terminal will use the solarized format when you use the unix "ls" command
vim ~/.profile
#Add the following line to your profile
export CLICOLOR=1
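
The "add the following line" steps above can also be scripted so that re-running them never duplicates a line. This is only a sketch; append_once is a hypothetical helper name and the demonstration writes to a scratch file rather than to a real profile.

```shell
# Hypothetical helper: append line $1 to file $2 unless the file
# already contains exactly that line, so it is safe to re-run.
append_once() {
    touch "$2" || return 1
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Demonstration on a scratch file rather than a real profile
scratch="$(mktemp "${TMPDIR:-/tmp}/append_once.XXXXXX")"
append_once 'export CLICOLOR=1' "$scratch"
append_once 'export CLICOLOR=1' "$scratch"   # second call adds nothing
cat "$scratch"
rm -f "$scratch"
```

In practice the calls would target the real files, for example append_once 'export CLICOLOR=1' ~/.profile, plus one call per vimrc line against ~/.vimrc.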

STEP B - Set up the Mac Terminal by double-clicking the solarized app, then open Terminal>Preferences and set "On startup, open: New window with profile" to "Solarized Dark ansi". Then set "New windows open with" to "Same Profile" and "New tabs open with" to "Same Profile".

6) Install/Update Python

Macs ship with python 2.7.10 but if you are starting out it is not a bad idea to update to python 3.x. To download the newer version google "python" and follow the links to download the "macOS 64-bit installer".

To install, click on the install package (python-3.7.3-macosx10.9.pkg).

NOTE/WARNING: Several essential Mac applications use python 2.7, so you need to leave it as the default, but you can easily use the 3.7 environment alongside it

Open the python interpreter in the terminal

python3

Exit the interpreter

exit()

In a script use the following header line

#!/usr/bin/env python3
OR CALL SCRIPT WITH
python3 YourSuperCoolScript.py

There are a number of very valuable packages you might want for scientific computing included in the scipy.org library such as numpy and pandas, along with other useful packages. They can be installed using pip as follows:

pip3 install --upgrade pip
pip3 install numpy        # version 1.16.3
pip3 install ipython      # version 7.5.0
pip3 install scipy        # version 1.3.0
pip3 install matplotlib   # version 3.0.3
pip3 install sympy        # version 1.4
pip3 install pandas       # version 0.24.2
pip3 install jupyter      # version 1.0.0
pip3 install pysam        # version 0.15.2, Samtools in python
pip3 install pyvcf        # version 0.6.8, VCF manipulation in python
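
The versions noted above can also be pinned so that a second machine ends up with the same packages. This is a sketch using pip's requirements-file format (name==version); the file name requirements.txt and the scratch directory are illustrative choices, and whether these 2019 versions still install on a newer Python is not guaranteed.

```shell
# Write the versions recorded above into a pip requirements file.
# The location is a scratch directory chosen for illustration.
req_dir="$(mktemp -d "${TMPDIR:-/tmp}/pyreq.XXXXXX")"
cat > "$req_dir/requirements.txt" <<'EOF'
numpy==1.16.3
ipython==7.5.0
scipy==1.3.0
matplotlib==3.0.3
sympy==1.4
pandas==0.24.2
jupyter==1.0.0
pysam==0.15.2
pyvcf==0.6.8
EOF
echo "Wrote $req_dir/requirements.txt"
# To install everything in one step (needs network access):
#   pip3 install -r "$req_dir/requirements.txt"
```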

7) Install PyCharm

This is a python IDE with good plugins for bash and other languages. There are two versions: a community version and a professional version. The professional version requires a license that is generally free for academic use, but the community version is excellent and is what I use today.

To install, google PyCharm, follow the prompts to download the program, and open the install package (pycharm-community-2019.1.2.dmg).

Configure to use python3:

PyCharm>Preferences>Build, Execution, Deployment>Console>Python Console

Select Python 3 version from the "Python Interpreter" drop-down list

8) Install MacPorts

There are many unix commands that do not come pre-loaded on Macs like dos2unix, wget or md5sum. To install these and many other tools MacPorts is an excellent resource. To download, google "MacPorts" and follow the "Installing MacPorts" instructions for your OS version (Make sure Xcode is available).

To install, click on the download (MacPorts-2.5.4-10.14-Mojave.pkg)

Test the install by opening a new terminal window and typing the following

port version

Assuming this printed the version rather than "command not found", you are ready to install the needed ports in a terminal window as follows. The MacPorts install updates your .profile, so restart the terminal first, and select Yes when asked to install dependencies:

sudo port install wget
sudo port install dos2unix
sudo port install md5deep
sudo port install gawk
sudo port install cairo        ## Used to install Pairoscope
sudo port install doxygen      ## Used to install Pairoscope
sudo port install cmake        ## Used to install Pairoscope
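
After the installs finish, a quick loop can confirm that each command is now reachable from the terminal. A minimal sketch; missing_tools is a hypothetical helper, and it assumes each port installs a command with the same name as the port.

```shell
# Hypothetical helper: print each named command that is NOT on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# cairo is a library rather than a command, so it is not listed here
missing="$(missing_tools wget dos2unix md5deep gawk doxygen cmake)"
if [ -z "$missing" ]; then
    echo "all ports are available"
else
    echo "still missing:" $missing
fi
```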

9) Install Homebrew

This is another package manager like MacPorts. I've gone back and forth between the two, as certain applications are only available on one or are more up-to-date on one than the other. I generally lean to MacPorts, but recently I've encountered a number of issues that could only be resolved by Homebrew.

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
#Install required tools that are not available from macports
brew install pkg-config       ## Needed to install perl GD package required for circos
brew install libgd            ## Needed to install perl GD package required for circos
brew install xz               ## Needed for bedtools compile to provide lzma.h

10) Update JAVA to version 11 (required by IGV) - creating a Broad Institute nightmare

Many of the tools you will use leverage JAVA and some need the JDK to be available. One primary tool we use regularly is the Integrative Genomics Viewer (IGV) from the Broad Institute. However, at the time of writing this document I've encountered an irritating issue. A macOS 10.14.5 build ships with JAVA version 6, BUT the current version of IGV requires JAVA version 11 and the GATK toolkit, also from the Broad Institute, requires JAVA version 8. Using both programs is a problem because the command "java" only uses one version, which seems to be the most recent release. My partial solution today is to install Java 11, then Java 8, and to hard code the java executable to be used when calling GATK. This works, but the provided GATK wrapper script still errors even though the command it prints works perfectly fine.

Okay, to at least get IGV working, first determine if you have a JDK and, if not, install one as follows:

# Open a new terminal window
# Type the following
java -version
# I got the following output
java version "1.6.0_65"
Java(TM) SE Runtime Environment (build 1.6.0_65-b14-468)
Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-468, mixed mode)
# However, this does not work with the current IGV, install the recommended version 11

To Download: https://www.oracle.com/technetwork/java/javase/downloads/index.html

After accepting the license I downloaded the Mac OS X version (jdk-11.0.3_osx-x64_bin.dmg). Double click on the downloaded DMG file and then follow the package install prompts. Quit the terminal application then open a new terminal window and check the version

java -version
#Output should now indicate the updated or installed version of JAVA
java version "11.0.3" 2019-04-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.3+12-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.3+12-LTS, mixed mode)

Now to get things working with GATK you need to do the following. Download the Java8 Developers JDK (jdk-8u211-macosx-x64.dmg), double click the downloaded DMG file and follow the package install prompts. Quit the terminal application then open a new terminal window and check the versions available.

java -version
#Still shows JAVA11 as the default
java version "11.0.3" 2019-04-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.3+12-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.3+12-LTS, mixed mode)

# To see what versions of JAVA are available on your system
/usr/libexec/java_home -verbose
Matching Java Virtual Machines (4):
    11.0.3, x86_64: "Java SE 11.0.3" /Library/Java/JavaVirtualMachines/jdk-11.0.3.jdk/Contents/Home
    1.8.0_211, x86_64: "Java SE 8" /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home
    1.6.0_65-b14-468, x86_64: "Java SE 6" /Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
    1.6.0_65-b14-468, i386: "Java SE 6" /Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home

# If you want to use JAVA8 specifically at the command line do the following:
/usr/libexec/java_home -v 1.8.0_211 --exec java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)

It's frustrating that you get this issue, but at least this way you can execute IGV with no modifications. On my personal laptop I do this 100x more than I ever use GATK, so for now this is my implementation. For more details on the exact usage of GATK see the GATK section further down in this document.
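
Because the Java 6, 8 and 11 version strings above use two different numbering styles (1.8.0_211 versus 11.0.3), a small helper can reduce them to a major version before a script decides which tool it can run. This is a sketch; java_major is a hypothetical name.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
# Old-style strings start with "1." (1.8.0_211 is Java 8),
# new-style strings start with the major version itself (11.0.3 is Java 11).
java_major() {
    case "$1" in
        1.*) _rest="${1#1.}"; echo "${_rest%%.*}" ;;
        *)   echo "${1%%.*}" ;;
    esac
}

java_major 1.6.0_65     # prints 6
java_major 1.8.0_211    # prints 8
java_major 11.0.3       # prints 11

# Report the default java, if one is installed
if command -v java >/dev/null 2>&1; then
    current="$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
    echo "default java is major version $(java_major "$current")"
fi
```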

11) Install IGV

Integrative Genomics Viewer (IGV) is the de facto standard for a lightweight GUI application that allows you to visualize a multitude of genomic data formats. There are several versions available on the IGV website (http://www.broadinstitute.org/igv/). We generally use the binary version as it can be configured for your exact system, unlike the Java Web Start versions. The file can be downloaded through your web browser, but since we have now installed "wget" we will do it from the command line.

Right click on the binary download link (you may have to login first) and select "copy link address"

# Open a new terminal window
cd local
wget http://data.broadinstitute.org/igv/projects/downloads/2.5/IGV_2.5.2.zip
## This was the version at time of writing
unzip IGV_2.5.2.zip
cd IGV_2.5.2
# Update the igv.sh file to change Xmx2g to the appropriate value for your system
# Xmx is the maximum amount of memory you will allow IGV to use
vim igv.sh
# After editing, the line that launches java should start like this (16g used here)
exec java --module-path="${prefix}/lib" -Xmx16g
# Now add an alias to your .profile so you can open IGV by typing IGV into the terminal
vim ~/.profile
# Add the following alias
alias IGV='$HOME/local/IGV_2.5.2/igv.sh'
# Now to start IGV open a terminal window and type:
IGV
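
The memory edit in igv.sh can also be made without opening vim. This is a sketch only; set_xmx is a hypothetical helper, it assumes the launcher contains a single -Xmx setting of the form -Xmx<number>g or -Xmx<number>m, and the demonstration runs on a stand-in file rather than the real igv.sh.

```shell
# Hypothetical helper: rewrite the -Xmx setting in launcher script $1 to $2
# (for example 16g). It goes through a temporary copy because the in-place
# flag of sed differs between macOS and GNU sed.
set_xmx() {
    tmp="$(mktemp "${TMPDIR:-/tmp}/xmx.XXXXXX")" || return 1
    sed "s/-Xmx[0-9][0-9]*[gGmM]/-Xmx$2/" "$1" > "$tmp" &&
        cat "$tmp" > "$1"      # cat keeps the script's execute permission
    status=$?
    rm -f "$tmp"
    return $status
}

# Demonstration on a stand-in file, not the real igv.sh
demo="$(mktemp "${TMPDIR:-/tmp}/igv_demo.XXXXXX")"
echo 'exec java --module-path="${prefix}/lib" -Xmx2g' > "$demo"
set_xmx "$demo" 16g
cat "$demo"
rm -f "$demo"
```

On the install described here the call would be set_xmx ~/local/IGV_2.5.2/igv.sh 16g.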

12) Install Sequence Analysis Tools

a) htslib - This package is part of the samtools distribution and installing it will make tabix and bgzip available. It is now core to samtools and bcftools, and seems to be required to be installed before bcftools.

Download the current release of htslib from github, right click on the "htslib-x.x.tar.bz2" to capture the address to the file of interest.

# Open a new terminal window and move into the local folder
cd local
# Download the current release using wget
wget https://github.com/samtools/htslib/releases/download/1.9/htslib-1.9.tar.bz2
# Decompress downloaded archive, enter folder and compile
tar xvjf htslib-1.9.tar.bz2
cd htslib-1.9/
./configure
make
# This copies the relevant binaries to your $HOME/local/bin directory
make prefix=$HOME/local install

b) Samtools - This is the base toolset for most manipulations of sequence/binary alignment maps (SAM/BAM)

Download the current release of samtools from github, right click on the "samtools-x.x.tar.bz2" to capture the address to the file of interest.

cd ~/local
wget https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2
tar xvjf samtools-1.9.tar.bz2
cd samtools-1.9
./configure
make
make prefix=$HOME/local install

c) Bcftools - This package is the variant calling and VCF manipulation suite produced by the samtools team.

Download the current release of bcftools from github, right click on the "bcftools-x.x.tar.bz2" to capture the address to the file of interest.

cd ~/local
wget https://github.com/samtools/bcftools/releases/download/1.9/bcftools-1.9.tar.bz2
tar xvjf bcftools-1.9.tar.bz2
cd bcftools-1.9
make
make prefix=$HOME/local install
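
htslib, samtools and bcftools share the same release URL layout and nearly the same build steps, so the three blocks above can be folded into one loop. This is a sketch under that assumption; release_url and build_pkg are hypothetical names, build_pkg runs ./configure for all three (which the bcftools block above skips), and it is only defined here, not run, because it needs network access and a compiler.

```shell
# Hypothetical helper: build the GitHub release URL used above for
# htslib, samtools and bcftools from a package name and version.
release_url() {
    echo "https://github.com/samtools/$1/releases/download/$2/$1-$2.tar.bz2"
}

# Hypothetical helper: download, compile and install one package into
# $HOME/local. The body runs in a subshell so the caller's directory
# is left unchanged.
build_pkg() (
    cd "$HOME/local" &&
    wget "$(release_url "$1" "$2")" &&
    tar xjf "$1-$2.tar.bz2" &&
    cd "$1-$2" &&
    ./configure &&
    make &&
    make prefix="$HOME/local" install
)

# Show what would be fetched; htslib goes first as noted above
for pkg in htslib samtools bcftools; do
    release_url "$pkg" 1.9
done
# To actually build:
#   for pkg in htslib samtools bcftools; do build_pkg "$pkg" 1.9; done
```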

d) Bedtools - This program is very useful for manipulating bed files and counting things

To get the most recent version search for "bedtools github", make sure to select the bedtools2 repository and then copy the address link

cd ~/local/
wget https://github.com/arq5x/bedtools2/releases/download/v2.28.0/bedtools-2.28.0.tar.gz
tar xvzf bedtools-2.28.0.tar.gz
cd bedtools2
make
cd bin
cp bedtools ~/local/bin

e) SeqTK - This program allows you to manipulate fasta and fastq files

cd ~/local/
wget https://github.com/lh3/seqtk/archive/v1.3.tar.gz
tar xvzf v1.3.tar.gz
cd seqtk-1.3/
make
cp seqtk ~/local/bin

f) BWA - This is likely the most heavily used and broadly supported next-generation sequencing aligner. To download search google for "bwa aligner" and follow the links to download the current version

# Right click on the download and select "copy link address", then use wget in the terminal to download

cd ~/local
wget https://github.com/lh3/bwa/releases/download/v0.7.17/bwa-0.7.17.tar.bz2
tar xvjf bwa-0.7.17.tar.bz2
cd bwa-0.7.17
make
cp bwa ~/local/bin

g) Picard - This set of JAVA applications provides a series of essential tools required for a multitude of sequencing analysis steps. To download, search google for "picard tools" and follow the "Latest Download" link to download the current version

# Right click on the download and select "copy link address", then use wget in the terminal to download

cd ~/local
mkdir picard-2.20.1   ## (VERSION SPECIFIC FOLDER)
cd picard-2.20.1
wget https://github.com/broadinstitute/picard/releases/download/2.20.1/picard.jar
# Put a copy of the jar file in the $HOME/local/bin
cp picard.jar ~/local/bin/
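
A jar file in local/bin is not directly executable, so it still has to be started through java. One option, in the same spirit as the aliases used elsewhere in this document, is a small shell function in .profile. This is a sketch; the function name picard is an illustrative choice and not something Picard provides.

```shell
# Hypothetical wrapper for .profile: run Picard from the copy placed in
# $HOME/local/bin, passing all arguments through unchanged.
picard() {
    jar="$HOME/local/bin/picard.jar"
    if [ ! -f "$jar" ]; then
        echo "picard.jar not found at $jar" >&2
        return 1
    fi
    java -jar "$jar" "$@"
}
```

After restarting the terminal a tool is then called as picard followed by the tool name and its arguments.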

h) GATK - This JAVA application contains a number of "best-practices" applications for post-processing alignment files. It also has a large number of additional applications that you will use frequently. To download, search google for "GATK tools" and follow the download links.

cd ~/local
wget https://github.com/broadinstitute/gatk/releases/download/4.1.2.0/gatk-4.1.2.0.zip
unzip gatk-4.1.2.0.zip
cd gatk-4.1.2.0

./gatk --help          ## This works
./gatk --list          ## This ERRORS, because the default java is v11, not the needed v8

### PROBLEM, GATK4 requires java 8, while IGV requires java 11 so running both is a bit of a mess, I don't really need a solution beyond testing, but want java 11 as the default so IGV works without any effort. 

### SEE THE UPDATING JAVA SECTION ABOVE FOR MORE DETAILS ABOUT JAVA INSTALLS

Today I've found a solution that calls the GATK local jar file directly without using the GATK wrapper provided by the GATK team. This should work fine as I got the idea for the solution from this GATK documentation page that has some useful hints on how to manage this issue.

# To call the GATK .jar file directly this works as a command after installing JAVA8 as noted above.
/usr/libexec/java_home -v 1.8.0_211 --exec java -jar $HOME/local/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar --help   
# To make this a simpler command I created an alias as follows:
vim ~/.profile
#Add this line to the Program Alias section
alias GATK4='/usr/libexec/java_home -v 1.8.0_211 --exec java -jar $HOME/local/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar'
#Restart the terminal and test the implementation
GATK4 --help

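I've tried editing the GATK launcher script provided in the distribution folder by updating the command line created at line 208. This still fails, but with a different error than calling the unedited wrapper. The command printed with the error works perfectly fine when pasted at the command line. The likely reason is that the wrapper hands the command to Python's subprocess.check_call as a list, where the first element is taken as the name of the executable, so the single string "/usr/libexec/java_home -v 1.8.0_211 --exec java" is looked up as one file name (spaces included) and is not found. Splitting it into separate list elements ("/usr/libexec/java_home", "-v", "1.8.0_211", "--exec", "java") would be the next thing to try.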

# Original function on lines 207 and 208
def formatLocalJarCommand(localJar):
  return ["java"] + PACKAGED_LOCAL_JAR_OPTIONS + [ "-jar", localJar]

# Original error message when executing ./gatk --list
Using GATK jar /Users/jkeats/local/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar
Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /Users/jkeats/local/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar --help
Exception in thread "main" java.lang.IncompatibleClassChangeError: Inconsistent constant pool data in classfile for class org/broadinstitute/barclay/argparser/CommandLineProgramGroup. Method lambda$static$0(Lorg/broadinstitute/barclay/argparser/CommandLineProgramGroup;Lorg/broadinstitute/barclay/argparser/CommandLineProgramGroup;)I at index 43 is CONSTANT_MethodRef and should be CONSTANT_InterfaceMethodRef
at org.broadinstitute.barclay.argparser.CommandLineProgramGroup.<clinit>(CommandLineProgramGroup.java:19)
at org.broadinstitute.hellbender.Main.printUsage(Main.java:384)
at org.broadinstitute.hellbender.Main.extractCommandLineProgram(Main.java:342)
at org.broadinstitute.hellbender.Main.setupConfigAndExtractProgram(Main.java:182)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:204)
at org.broadinstitute.hellbender.Main.main(Main.java:291)

# Edited function
def formatLocalJarCommand(localJar):
  return ["/usr/libexec/java_home -v 1.8.0_211 --exec java"] + PACKAGED_LOCAL_JAR_OPTIONS + [ "-jar", localJar]

# New error message when executing ./gatk --list
Using GATK jar /Users/jkeats/local/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar
Running:
    /usr/libexec/java_home -v 1.8.0_211 --exec java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /Users/jkeats/local/gatk-4.1.2.0/gatk-package-4.1.2.0-local.jar --help
Traceback (most recent call last):
  File "./gatk", line 479, in <module>
    main(sys.argv[1:])
  File "./gatk", line 152, in main
    runGATK(sparkRunner, sparkSubmitCommand, dryRun, gatkArgs, sparkArgs, javaOptions)
  File "./gatk", line 328, in runGATK
    runCommand(cmd, dryrun)
  File "./gatk", line 384, in runCommand
    check_call(cmd, env=gatk_env)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 535, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 522, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
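
Another route that leaves the wrapper script untouched is to set JAVA_HOME for a single command, since the java launcher on macOS normally honors that variable. The sketch below has not been verified against gatk-4.1.2.0; with_java and JAVA_HOME_TOOL are hypothetical names, the latter only there so the lookup tool can be swapped out.

```shell
# Hypothetical helper: run a command with JAVA_HOME pointing at the JDK
# that matches version $1, without changing the default java.
# JAVA_HOME_TOOL is an illustrative override; it defaults to the macOS tool.
with_java() {
    version="$1"
    shift
    jdk="$("${JAVA_HOME_TOOL:-/usr/libexec/java_home}" -v "$version")" || return 1
    env JAVA_HOME="$jdk" "$@"
}

# Assumed usage from inside the gatk-4.1.2.0 folder (macOS with Java 8 installed):
#   with_java 1.8 ./gatk --list
```

Because the variable is set only for that one command, IGV and everything else keep using Java 11.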

i) Circos - This tool is used to produce the fancy circular diagrams that show mutations, structural differences, and copy number differences. You can fine tune these plots to the nth degree. Unfortunately, this is one of the most difficult programs to get installed but the documentation is excellent making it well worth the effort. To download, google "circos plot" or follow the link to (circos.ca) and follow the links to download the core distribution and tools.

#Right click on the most recent core version and select "copy link address", then paste after wget in terminal to download

cd ~/local/
wget http://circos.ca/distribution/circos-0.69-6.tgz
tar xvzf circos-0.69-6.tgz
# Now for the fun part of getting everything setup...  I promise it works
cd circos-0.69-6/bin/
# Test which required modules are available or missing
./circos -module
## This is the output that was produced on my machine
ok       1.29 Carp
ok       0.36 Clone
missing            Config::General
ok       3.40 Cwd
ok      2.145 Data::Dumper
ok       2.52 Digest::MD5
ok       2.84 File::Basename
ok       3.40 File::Spec::Functions
ok       0.23 File::Temp
ok       1.51 FindBin
missing            Font::TTF::Font
missing            GD
missing            GD::Polyline
ok       2.39 Getopt::Long
ok       1.16 IO::File
ok       0.33 List::MoreUtils
ok       1.38 List::Util
missing            Math::Bezier
ok      1.998 Math::BigFloat
ok       0.06 Math::Round
missing            Math::VecStat
ok       1.03 Memoize
ok       1.32 POSIX
ok       1.08 Params::Validate
ok       1.61 Pod::Usage
missing            Readonly
ok 2013031301 Regexp::Common
missing            SVG
missing            Set::IntSpan
missing            Statistics::Basic
ok       2.41 Storable
ok       1.17 Sys::Hostname
ok       2.02 Text::Balanced
missing            Text::Format
ok     1.9725 Time::HiRes

## Now install all the missing modules using the CPAN downloader in the terminal
sudo perl -MCPAN -e shell

## This caused some new issues, maybe TGen permissions, and I had to answer three questions:
1) Would you like to configure as much as possible automatically? [yes] yes

Warning: You do not have write permission for Perl library directories.
To install modules, you need to configure a local Perl library directory or
escalate your privileges.  CPAN can help you by bootstrapping the local::lib
module or by configuring itself to use 'sudo' (if available).  You may also
resolve this problem manually if you need to customize your setup.

2) What approach do you want?  (Choose 'local::lib', 'sudo' or 'manual') [local::lib] local::lib

3) Would you like me to automatically choose some CPAN mirror sites for you? (This means connecting to the Internet) [yes] yes

# Order used to matter; at least GD needs to be installed first
install GD
install Config::General
install Font::TTF::Font
install GD::Polyline
install Math::Bezier
install Math::VecStat
install Readonly
install SVG
install Set::IntSpan
install Statistics::Basic
install Text::Format
exit
## Now test if all the required modules are now available
./circos -module
# If all went well each and every module should now be marked as 'ok'
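
With this many modules it is easy to overlook one that is still missing. A small filter over the module report can list just those. This is a sketch; list_missing is a hypothetical helper and it assumes the report has the "missing <module>" layout shown above and is written to standard output.

```shell
# Hypothetical helper: read "circos -module" output on stdin and print
# only the names of the modules reported as missing.
list_missing() {
    awk '$1 == "missing" { print $2 }'
}

# Demonstration with a few lines copied from the output above
list_missing <<'EOF'
ok       1.29 Carp
missing            Config::General
ok       3.40 Cwd
missing            GD
EOF
```

Against the real install this would be ./circos -module | list_missing, and an empty result means every module is 'ok'.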

#Download the circos tools

cd ~/local/
wget http://circos.ca/distribution/circos-tools-0.23.tgz
tar xvzf circos-tools-0.23.tgz

#Download the tutorials and Test the Install

cd ~/local/
# There are no version 0.69 tutorials, so use the last version provided at this time
wget http://circos.ca/distribution/circos-tutorials-0.67.tgz
tar xvzf circos-tutorials-0.67.tgz
cd circos-tutorials-0.67/tutorials/2/2/
~/local/circos-0.69-6/bin/circos -conf circos.conf

You should see a series of messages print to the terminal screen. If you navigate to the folder using the finder window you should see a new file called "circos.png". Open the image file and check to ensure it produced a circular image of each human chromosome with each chromosome in a different color.

#To make it easier to use circos we will add an alias to our profile

# Open your profile and add an alias to the circos binary
vim ~/.profile
alias CIRCOS='$HOME/local/circos-0.69-6/bin/circos'
# Close the terminal application and reopen to test the alias function
cd local/circos-tutorials-0.67/tutorials/2/2/
CIRCOS -conf circos.conf

As our pipeline developer would say... "Much Success!!"

j) Pairoscope - This one was painful to sort out... yet again. But if you follow these instructions it should work well. To access the download you will use "git", but to see where it comes from you can access the Wash U tools website (http://tvap.genome.wustl.edu/tools/) and follow the links to the pairoscope download.

cd ~/local/
git clone https://github.com/genome/pairoscope.git
mkdir pairoscope/build
cd pairoscope/build
cmake ../
make -j
# If you get an error like I did doing this I can save you a day of effort if you follow these steps
cd vendor/src/gtest160/include/gtest/internal
# Now we need to edit the "gtest-port.h" file to change a 1 to a 0 in the middle of the file (I have no idea why this works but it does)
vim gtest-port.h
# You will need to edit line number 437 from  # define GTEST_HAS_TR1_TUPLE 1   TO    # define GTEST_HAS_TR1_TUPLE 0
# To view line numbers
:set nu
# To navigate straight to line 437
:437
# Enter edit mode by typing "i" and then edit the line so it looks like:
# define GTEST_HAS_TR1_TUPLE 0
# Save the changes, you might have to force the save after you exit edit mode by hitting <esc>
:w!
:q
# Okay now lets try it again
cd ~/local/pairoscope/build/
make -j
# Assuming that worked, now move the binary to the local/bin directory
cd bin
cp pairoscope ~/local/bin
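
The vim steps above depend on the define sitting at line 437, which may shift between gtest versions. Searching for the text gives the line number to jump to. This is a sketch; tr1_tuple_lines is a hypothetical helper and it assumes the line reads exactly as quoted above, with nothing after the 1.

```shell
# Hypothetical helper: print the line number(s) in header $1 where
# GTEST_HAS_TR1_TUPLE is defined as 1, for use with ":<number>" in vim.
tr1_tuple_lines() {
    grep -n '^# *define GTEST_HAS_TR1_TUPLE 1$' "$1" | cut -d: -f1
}

# Demonstration on a three-line stand-in for gtest-port.h
demo="$(mktemp "${TMPDIR:-/tmp}/gtest_demo.XXXXXX")"
printf '%s\n' '// stand-in header' '# define GTEST_HAS_TR1_TUPLE 1' '// end' > "$demo"
tr1_tuple_lines "$demo"     # prints 2
rm -f "$demo"
```

In the build folder the call would be tr1_tuple_lines vendor/src/gtest160/include/gtest/internal/gtest-port.h.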

k) FASTQC - This is an excellent program for visually checking the quality of your sequencing reads before alignment

cd ~/local/
wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip
unzip fastqc_v0.11.8.zip
cd FastQC
chmod +x fastqc
# Add a symbolic link to the $HOME/local/bin $PATH directory so you can call the application easily
ln -s ~/local/FastQC/fastqc ~/local/bin/fastqc