RNA sequencing

RNA sequencing is performed on all HipSci iPS cell lines that are selected for banking after passing QC. Sequencing and primary analysis are performed at the Wellcome Trust Sanger Institute.

Primary analysis

HipSci’s RNA-seq analysis pipeline maps sequence reads to the human GRCh37 reference using the STAR spliced aligner. The mapping uses version 19 of the Gencode gene annotation to enable splice-aware alignment.
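
As a rough illustration of this kind of alignment step, the sketch below assembles a STAR command line in Python. The flags are standard STAR options, but the file paths, the thread count, and the use of paired-end FASTQ input are assumptions made for the example; they are not HipSci's actual pipeline settings.

```python
import shlex


def build_star_command(genome_dir, gtf_path, fastq_1, fastq_2, out_prefix, threads=4):
    """Assemble a STAR alignment command as an argument list.

    All paths are hypothetical placeholders; the exact parameters
    used by the HipSci pipeline are not stated in this document.
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,          # STAR index built from the GRCh37 reference
        "--sjdbGTFfile", gtf_path,          # Gencode v19 annotation supplies known splice junctions
        "--readFilesIn", fastq_1, fastq_2,  # assumes paired-end reads already converted to FASTQ
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", out_prefix,
    ]


command = build_star_command(
    genome_dir="GRCh37_star_index",         # hypothetical directory name
    gtf_path="gencode.v19.annotation.gtf",  # hypothetical local path
    fastq_1="sample_1.fastq",
    fastq_2="sample_2.fastq",
    out_prefix="sample.",
)

# Print the command for inspection rather than running it.
print(shlex.join(command))
```

On a machine with STAR installed, the same list could be passed to `subprocess.run(command, check=True)` to run the alignment.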

Getting the data

Complete lists of RNA-seq data can be found under the files tab of the cell lines and data browser or in the dataset indexes on the FTP site.

  • Raw sequencing reads – Distributed in the CRAM file format. Any cell line can have multiple associated CRAM files; each corresponds to a single lane of sequencing.
  • Splice-aware STAR alignment – Distributed in the BAM file format. We distribute one BAM file per cell line.
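
The layout described above (several per-lane CRAM files and a single BAM file for each cell line) can be checked with a short script. The sketch below is illustrative only: the cell line names, file names, and record structure are made up, and the real dataset indexes may be organised differently.

```python
from collections import defaultdict

# Hypothetical index records: (cell line name, file name).
records = [
    ("line_A", "line_A.lane1.cram"),
    ("line_A", "line_A.lane2.cram"),
    ("line_A", "line_A.star.bam"),
    ("line_B", "line_B.lane1.cram"),
    ("line_B", "line_B.star.bam"),
]


def summarise_files(records):
    """Count CRAM and BAM files for each cell line."""
    counts = defaultdict(lambda: {"cram": 0, "bam": 0})
    for cell_line, file_name in records:
        extension = file_name.rsplit(".", 1)[-1].lower()
        if extension in ("cram", "bam"):
            counts[cell_line][extension] += 1
    return dict(counts)


summary = summarise_files(records)
for cell_line, n in sorted(summary.items()):
    print(f"{cell_line}: {n['cram']} CRAM (one per lane), {n['bam']} BAM")
```

A cell line reporting more than one BAM file, or none, would suggest an incomplete or duplicated entry in the listing.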

For managed access cell lines, RNA-seq files are archived in the EGA. The data browser contains links to the relevant EGA dataset page, from where researchers can request access to the data.

For open access cell lines, RNA-seq files are archived in ENA. Data are openly available to anybody, and the data browser contains direct links to the files on the ENA FTP server.


HipSci’s FTP site contains: