cDNA sequences were base-called and quality-trimmed using phred (trim_cutoff = 0.05) [47], and vector sequences were removed using cross_match [48]. Sequences shorter than 50 bp after trimming were discarded. Where possible, 3' UTR lengths were estimated by combining approximate insert sizes determined by PCR with 5' sequence data; if the 5' sequence did not extend into the coding region, we could not estimate 3' UTR size. We counted cDNAs from a given gene as showing alternative polyadenylation site usage if their 3' UTR length estimates varied by at least 400 bp. Smaller variation could be real, but may not be distinguishable from error in our size estimates.
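The two numeric criteria described above (the 50 bp minimum length after trimming and the 400 bp threshold for alternative polyadenylation) can be illustrated with a minimal Python sketch. This is not the code used in the study: the function names, the input layout of (gene, estimated 3' UTR length) pairs, and the reading of "varied by at least 400 bp" as the difference between the largest and smallest estimate for a gene are all assumptions made for illustration, and the example values are invented.

```python
from collections import defaultdict

MIN_TRIMMED_LENGTH = 50   # bp; shorter trimmed sequences are discarded
MIN_UTR_DIFFERENCE = 400  # bp; minimum spread taken to indicate alternative poly(A) sites


def keep_trimmed_read(sequence):
    """Return True if a trimmed sequence is long enough to keep (hypothetical helper)."""
    return len(sequence) >= MIN_TRIMMED_LENGTH


def genes_with_alternative_polyadenylation(utr_estimates):
    """Return the set of genes whose 3' UTR length estimates span >= 400 bp.

    utr_estimates: iterable of (gene_id, utr_length_bp) pairs, one per cDNA.
    A length of None marks a cDNA for which no estimate was possible.
    Assumption: "varied by" is interpreted as max estimate minus min estimate.
    """
    lengths_by_gene = defaultdict(list)
    for gene_id, utr_length in utr_estimates:
        if utr_length is not None:  # skip cDNAs without a usable estimate
            lengths_by_gene[gene_id].append(utr_length)

    return {
        gene_id
        for gene_id, lengths in lengths_by_gene.items()
        if max(lengths) - min(lengths) >= MIN_UTR_DIFFERENCE
    }


if __name__ == "__main__":
    # Invented toy values, for illustration only.
    estimates = [
        ("geneA", 300), ("geneA", 750),    # spread of 450 bp: flagged
        ("geneB", 500), ("geneB", 850),    # spread of 350 bp: not flagged
        ("geneC", None), ("geneC", 1200),  # only one usable estimate: not flagged
    ]
    print(sorted(genes_with_alternative_polyadenylation(estimates)))
```

In this sketch a gene with only one usable estimate can never be flagged, which matches the requirement that variation be observed between cDNAs of the same gene.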