https://cm.jefferson.edu/novel-mirnas-2015/

Download instructions:
  a. Visit https://cm.jefferson.edu/novel-mirnas-2015/ and download the predictions for the novel miR of interest
  b. Uncompress them
     If on a Unix-based system: tar zxvf .tar.gz
     If on Windows: a zip utility such as WinZip should be able to unpack *.tar.gz files

Contents:
  readme.txt - (this file)
  .pdf - (PDF from miRDeep2 showing details and graphs regarding the discovery of the molecule)
  .1.7.12.-12.7000000.1.-1.2.1.1.txt - (ENSEMBL75 cDNA target-site predictions for the novel miR. RNA22 v2.0 was used for the predictions. See below on how to read the output)

How to read the prediction file:
  a. Example:
     TJU_CMC_MD2.ID00001.5p-miR cDNA|ENSG00000010072|ENST00000008440|1|1|SPRTN SEQ_FROM_257_277 0 test.seq -12.40 GGACTTGCAGGCACTGTTTGT GCGAACAAAGCCCCGGGACG ...((((..(((..((((((( )))))))..))).))))... 14 14 21 0 0 0.283000

  b. How to read:
     Each line of the miR file is a predicted target site for that miR.
     Column 1 (and filename): name of the miR
     Column 2: Ensembl Gene ID, Ensembl Transcript ID, chromosome, and strand (-1 for reverse, 1 for sense), separated by "|" (in the example above the field also starts with "cDNA" and ends with the gene symbol)
     Column 3: used to calculate the start/end location of the predicted target site that the miR targets - see section c below
     Column 4: always 0 in RNA22 v2
     Column 5: always test.seq (you can ignore this)
     Column 6: binding energy in Kcal/mol (a negative value, e.g. -12.40) of the predicted heteroduplex between the microRNA and the targeted messenger RNA
     Column 7: target site sequence
     Column 8: miR sequence
     Column 9-10: show, in dot-bracket notation, where the base pairings are in the formed heteroduplex
     Column 11 & 12: always the same; the number of paired nucleotides in the heteroduplex
     Column 13: the span/length of the predicted target (used to calculate the start/end location of the predicted target site that the miR targets - see section c below)
     Column 14-15: please ignore and do not use
     Column 16: p-value representing the likelihood that the position is an MRE location (note: the p-value does not tell you the likelihood that the miR is the proper mate for that MRE location. For that, you may want to look at other variables such as the number of bulges, the binding energy, the number of GU wobbles, etc.)

  c. How to calculate the local coordinates of the predicted target site (1-based, relative to the start of the cDNA of the transcript):
     - Use columns 3 and 13
     - Column 4 is an offset (always 0 in RNA22 v2, so it does not change the result)
     So for the example above:
     - the start location is 257 (from column 3)
     - the end location is 257 (from column 3) + 21 (from column 13) - 1 = 277
       (you may also be able to use the end location given in column 3)
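
The column layout and the coordinate calculation described above can be sketched in Python as follows. This is only an illustrative sketch, not part of the original distribution. It assumes that the columns are separated by whitespace (tabs or spaces), that column 2 follows the layout seen in the example line (cDNA|gene ID|transcript ID|chromosome|strand|gene symbol), and that column 3 always has the form SEQ_FROM_<start>_<end>. The names TargetSite and parse_line are hypothetical and chosen for this example.

```python
from typing import NamedTuple


class TargetSite(NamedTuple):
    """One predicted target site (one line of the prediction file)."""
    mir_name: str
    gene_id: str
    transcript_id: str
    start: int       # 1-based, relative to the start of the transcript cDNA
    end: int         # 1-based, inclusive
    energy: float    # binding energy from column 6
    paired: int      # number of paired nucleotides (column 11)
    p_value: float   # column 16


def parse_line(line: str) -> TargetSite:
    # Assumption: columns are whitespace-separated and no column contains spaces.
    cols = line.split()
    if len(cols) != 16:
        raise ValueError(f"expected 16 columns, found {len(cols)}")

    # Assumption: column 2 looks like cDNA|gene|transcript|chromosome|strand|symbol.
    ids = cols[1].split("|")

    # Column 3 looks like SEQ_FROM_257_277; the third "_"-separated piece is the start.
    start = int(cols[2].split("_")[2])
    span = int(cols[12])            # column 13: span/length of the target
    end = start + span - 1          # the formula from section c

    return TargetSite(
        mir_name=cols[0],
        gene_id=ids[1],
        transcript_id=ids[2],
        start=start,
        end=end,
        energy=float(cols[5]),
        paired=int(cols[10]),
        p_value=float(cols[15]),
    )


# The example line from section a.
example = (
    "TJU_CMC_MD2.ID00001.5p-miR "
    "cDNA|ENSG00000010072|ENST00000008440|1|1|SPRTN "
    "SEQ_FROM_257_277 0 test.seq -12.40 "
    "GGACTTGCAGGCACTGTTTGT GCGAACAAAGCCCCGGGACG "
    "...((((..(((..((((((( )))))))..))).))))... "
    "14 14 21 0 0 0.283000"
)
site = parse_line(example)
print(site.transcript_id, site.start, site.end)
```

For the example line this gives a start of 257 and an end of 277, matching the manual calculation in section c. Columns 14 and 15 are deliberately not read, since the readme says to ignore them.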