The expression profiles and sequence variants of 476 early-stage urothelial carcinomas were studied using whole-transcriptome sequencing. RNA-Seq libraries were prepared by Ribo-Zero treatment of total RNA (to reduce the rRNA content) followed by library preparation using ScriptSeq. The libraries were paired-end sequenced (2 x 101 bp) on an Illumina HiSeq 2000, and the resulting FASTQ files were processed using tools from the Genome Analysis Toolkit (GATK) and from the Tuxedo suite. Because the sequence data (BAM and VCF files) contain person-identifying information, access requires a signed controlled-access form; the data are available on request from the European Genome-phenome Archive (EGA) under study ID EGAS00001001236. An expression matrix of FPKM values is available without restriction at ArrayExpress.
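
For readers unfamiliar with the unit of the expression matrix, the following is a minimal sketch of the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments). It is not the study's pipeline: tools in the Tuxedo suite estimate FPKM with a statistical model that handles isoforms and multi-mapped reads, and the function name, argument names, and toy numbers below are hypothetical.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Return FPKM per gene from raw fragment counts.

    fragment_counts: dict mapping gene -> number of mapped fragments (hypothetical input)
    gene_lengths_bp: dict mapping gene -> transcript length in base pairs (hypothetical input)
    """
    total_fragments = sum(fragment_counts.values())
    if total_fragments == 0:
        raise ValueError("no mapped fragments in this sample")
    # FPKM = count * 1e9 / (length in bp * total mapped fragments):
    # the 1e9 factor combines "per kilobase" (1e3) and "per million" (1e6).
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_fragments)
        for gene, count in fragment_counts.items()
    }


# Toy example with made-up numbers, only to show the call shape.
toy_counts = {"GENE_A": 100, "GENE_B": 300}
toy_lengths = {"GENE_A": 1000, "GENE_B": 2000}
toy_fpkm = fpkm(toy_counts, toy_lengths)
```

Because the value is normalised by both transcript length and sequencing depth, FPKM values can be compared across genes within a sample more readily than raw counts can.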