The Mitochondrial and Nuclear tRNA database (MINTbase) is an easy-to-use resource for accessing information on tRNA fragments (tRFs) [1,2]. MINTbase offers five vistas, that is, five different ways to view tRF information. The vistas are easy to use, interconnected, and flexible, providing each user with a personalized experience.
Use of the data contained in this database is limited to research-oriented, non-profit activities only. If you are interested in for-profit use, please contact us.
Here are a few suggestions to help you personalize MINTbase. If you are wondering:
Keep reading to find more information on the filters, the vistas, and more examples!
MINTbase contains all possible tRNA fragments (tRFs) with lengths 16-50 nt that originate from mature tRNAs. The types of tRFs included in the database are 5'-half (or 5'-tRH), 5'-tRF, i-tRF, 3'-tRF, and 3'-half (or 3'-tRH). tRNA halves (tRHs) are the first category of tRNA-derived fragments that was reported [3,4] in the early 2000s. More recently, 5'-tRF, 3'-tRF [5], and i-tRF [6] were identified.
To list all possible tRFs, we start from the mature tRNA sequences and enumerate all possible subsequences. When we add deep sequencing data to the database, we note which tRFs were, in fact, discovered in a sample. We call those tRFs "expressed". MINTbase, by default, shows only expressed tRFs. See more on this on the RPM filter.
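The enumeration step described above can be sketched as follows. This is a minimal illustration, not MINTbase's actual code: the function name and the toy sequence are hypothetical, and the 16-50 nt length bounds are taken from the text.

```python
def enumerate_candidate_trfs(mature_trna, min_len=16, max_len=50):
    """Return the distinct contiguous subsequences with min_len <= length <= max_len."""
    candidates = set()
    for start in range(len(mature_trna)):
        # Longest fragment that still fits when starting at this position.
        longest = min(max_len, len(mature_trna) - start)
        for length in range(min_len, longest + 1):
            candidates.add(mature_trna[start:start + length])
    return candidates


# Made-up 20-nt sequence used only for illustration; it is not a real tRNA.
toy_trna = "ACGTTGCAAGCTTAGGCATC"
for fragment in sorted(enumerate_candidate_trfs(toy_trna), key=len):
    print(len(fragment), fragment)
```

Collecting the fragments in a set means that a sequence occurring at several positions is listed only once.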
MINTbase contains a variety of projects, the most notable of which is The Cancer Genome Atlas (TCGA). MINTbase is regularly updated. The profiles of all samples presented in MINTbase were mapped using MINTmap, our tRNA-derived fragment mapping tool.
MINTbase's filters differ depending on the vista. They can be used to refine the database's tRF results. For example, you can retrieve only the 5'-tRHs of the AlaAGC or SerACT amino isodecoders. Some filters are required, namely, the genome, and the minimum RPM value, while others are optional, e.g., the tRF anticodon or the tRF sequence.
MINTbase currently contains the tRFs of Homo sapiens GRCh37. The GRCh37 tRNA types are true tRNA, pseudo tRNA, exact tRNA-lookalikes, and MT-tRNA. We retrieved the tRNAs from gtRNAdb v1 and [7], and mapped them against the genome from the UCSC genome browser. All profiles of the projects were mapped using MINTmap.
Reads per million (RPM) measures the abundance of a molecule within a sample. We calculate the RPM by dividing the number of times a tRF was found in a sample by the total number of sequenced reads of the sample and multiplying by one million. We say that a tRF is expressed when there exists at least one sample submitted to the database in which the tRF has an RPM of at least 1.
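As a worked illustration of the RPM formula, here is a small sketch. The function and parameter names are hypothetical and the read counts are invented for the example.

```python
def reads_per_million(trf_read_count, total_sequenced_reads):
    """RPM = (reads of the tRF / total sequenced reads of the sample) * 1,000,000."""
    if total_sequenced_reads <= 0:
        raise ValueError("total_sequenced_reads must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return trf_read_count * 1_000_000 / total_sequenced_reads


# Invented numbers: a tRF seen 25 times in a sample of 5 million sequenced reads.
print(reads_per_million(25, 5_000_000))  # 5.0
```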
In the database, we include all tRFs that can exist, regardless of whether they are yet expressed or not. Since the tRNA sequence space of a genome is finite, there exists an upper bound to the number of distinct tRF sequences that mature tRNAs can ever produce. As the number of discovered tRFs is likely to increase, it is reasonable to expect that at least some of these currently not-expressed tRFs will be discovered by subsequent studies to be expressed in some setting. The user can select the threshold of their choice using this filter.
When the user selects "all tRFs", both the expressed and the not-expressed tRFs of the database are displayed. Note that not-expressed also includes tRFs that had an RPM of less than 1 in some samples. When the user selects any numerical value, only the tRFs that had an RPM equal to or higher than that value in any of the samples currently submitted to MINTbase are displayed.
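The behavior of the RPM filter can be sketched as below. This is only an illustration of the rule described above: the function name, the dictionary layout, and the tRF labels are hypothetical, and "all" stands in for the "all tRFs" option.

```python
def filter_by_rpm(rpm_by_trf, threshold):
    """Return the names of the tRFs that pass the RPM filter.

    rpm_by_trf maps a tRF name to the list of its RPM values, one per sample.
    threshold is either the string "all" or a number.
    """
    if threshold == "all":
        # Expressed and not-expressed tRFs are both shown.
        return sorted(rpm_by_trf)
    cutoff = float(threshold)
    # A tRF passes if ANY sample reaches the cutoff (RPM >= cutoff).
    return sorted(
        name for name, values in rpm_by_trf.items()
        if any(value >= cutoff for value in values)
    )


# Placeholder labels and invented RPM values.
example = {"tRF-A": [0.2, 3.5], "tRF-B": [0.4], "tRF-C": []}
print(filter_by_rpm(example, "all"))  # ['tRF-A', 'tRF-B', 'tRF-C']
print(filter_by_rpm(example, 1))      # ['tRF-A']
```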
The tRF types are 5'-half (or 5'-tRH), 5'-tRF, i-tRF, 3'-tRF, and 3'-half (or 3'-tRH).
The type of a tRF depends on its endpoints. 5'-tRHs and 5'-tRFs have an endpoint in the 5' end of the mature tRNA. The difference between a 5'-tRF and a 5'-tRH is that tRNA halves have their other endpoint in the anticodon loop, unlike 5'-tRFs. Similarly, for the 3'-tRFs and 3'-tRHs. Internal tRNA fragments, or i-tRFs, are the tRNA-derived fragments for which none of these rules apply. They begin and end anywhere in the tRNA except the 5' and 3' ends. They are the most populous tRF type.
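The endpoint rules above can be written as a small decision function. This is a simplified sketch under stated assumptions: coordinates are 1-based and inclusive on the mature tRNA, the anticodon-loop boundaries are supplied by the caller, and the example numbers (a 76-nt tRNA with the loop at positions 32-38) are illustrative only. MINTbase's actual rules may treat edge cases differently.

```python
def classify_trf(start, end, trna_length, loop_start, loop_end):
    """Classify a fragment by where its endpoints fall on the mature tRNA."""
    if not 1 <= start <= end <= trna_length:
        raise ValueError("fragment must lie within the tRNA")

    def in_anticodon_loop(position):
        return loop_start <= position <= loop_end

    if start == 1:
        # Begins at the 5' end: a half if it ends in the anticodon loop.
        return "5'-tRH" if in_anticodon_loop(end) else "5'-tRF"
    if end == trna_length:
        # Ends at the 3' end: a half if it begins in the anticodon loop.
        return "3'-tRH" if in_anticodon_loop(start) else "3'-tRF"
    # Neither endpoint is at an end of the tRNA.
    return "i-tRF"


print(classify_trf(1, 34, 76, 32, 38))   # 5'-tRH
print(classify_trf(10, 40, 76, 32, 38))  # i-tRF
```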
The type filter allows the user to select one or multiple tRF types. To select multiple types, hold down the shift or command key while you click on the types you need. When you select multiple types, the tRFs that belong in any of the types you selected will appear (in other words, you retrieve the union of the tRF types).
This filter allows the user to select one or multiple amino acids and anticodons. The results of the database will include only tRFs that may originate from tRNAs of those isodecoders. To select multiple options, hold down the shift or command key while clicking on the options you need. When you select multiple options, the results are the combination of each separate selection (in other words, you will see the union of your selections).
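The "union" behavior of a multi-select filter can be illustrated with a short sketch. The record layout, field name, and tRF labels are hypothetical, and treating an empty selection as "no filter" is an assumption made for the example.

```python
def select_union(records, field, selected):
    """Keep the records whose `field` set shares at least one value with `selected`."""
    if not selected:
        # Assumption: nothing selected means the optional filter is not applied.
        return list(records)
    wanted = set(selected)
    return [record for record in records if record[field] & wanted]


# Placeholder records; the names are not real MINTbase identifiers.
records = [
    {"name": "tRF-x", "isodecoders": {"AlaAGC"}},
    {"name": "tRF-y", "isodecoders": {"GlyGCC", "GlyCCC"}},
    {"name": "tRF-z", "isodecoders": {"TyrGTA"}},
]
hits = select_union(records, "isodecoders", ["AlaAGC", "GlyGCC"])
print([record["name"] for record in hits])  # ['tRF-x', 'tRF-y']
```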
We are following the notation (n) and (mt) to indicate nuclear and mitochondrial tRNAs. We also use (mt-la) to indicate tRNA-lookalikes.
You can use this filter to select samples of a specific tissue or disease. Only tRFs that are present in those samples will be returned from the database. Some examples are TCGA-BRCA or Lung. You can also view the full list of keywords. The search can identify misspellings.
To ensure the effortless use of MINTbase, we accommodated multiple tRNA nomenclatures and thus allow the use of different tRNA identifiers when composing a search. The nomenclatures recognized are our own names, HGNC symbols, Legacy IDs, and gtRNAdb IDs.
This field accepts any nucleotide sequence. If the sequence is an expressed tRF in the selected genome, MINTbase will return the results. If it's not, suggestions will be made for similar tRFs or MINTbase will report no results. We recommend that you set RPM to "all" in case your fragment is a tRF but is not yet expressed in the database.
This field accepts License Plates and an additional nomenclature we have used in earlier papers. As with the tRF sequence filter, we recommend that you set RPM to "all" in case your fragment is a tRF but is not yet expressed in the database.
The user can select chromosome, strand, and genomic start and end. All tRFs that have at least one nucleotide in the selected range, and are expressed in the database, will be returned.
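The "at least one nucleotide in the selected range" rule is an interval-overlap test, sketched below. The class and field names are hypothetical, coordinates are assumed to be 1-based and inclusive, and requiring the strand to match is an assumption based on strand being one of the selectable fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    chromosome: str
    strand: str  # "+" or "-"
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive


def overlaps(trf, query):
    """True if the tRF shares at least one nucleotide with the query range."""
    return (
        trf.chromosome == query.chromosome
        and trf.strand == query.strand
        and trf.start <= query.end
        and query.start <= trf.end
    )


# Invented coordinates for illustration.
query = Locus("chr6", "+", 1000, 2000)
print(overlaps(Locus("chr6", "+", 1990, 2020), query))  # True
print(overlaps(Locus("chr6", "+", 2001, 2030), query))  # False
```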
MINTbase offers five distinct vantage points or "vistas".
The database's "Genomic Loci" vista provides access to all possible genomic origins of the sought tRFs and their characteristics with reference to the source tRNA genes. The "RNA Molecule" vista provides access to the distribution of a tRF across all tRNAs in tRNA space. The "tRNA Alignment" vista visualizes the tRF(s) in the sequence context of the parental tRNA. The "Expression" vista provides information about the different samples, tissues, diseases, etc. in which each tRF has been reported expressed, together with the corresponding Pub-Med identifier. Finally, the "tRF Summary" vista summarizes all the information in MINTbase for each tRF individually in the form of a "record".
This vista provides a genome-wide overview of the database's tRFs. Given the source ambiguity of some tRFs, this vista's output usually comprises multiple rows that refer to all of the genomic instances of the same tRF sequence. Each of the optional search filters, e.g., search by tRNA label, can be applied at the genome level to allow the user to further sub-select among the currently reported results. The Genomic Loci vista is meant for those users who want to explore similarities among tRFs at the genome level. The data is presented in a tabular format that contains one genomic instance per row.
Example: You can visit the Genomic Loci vista to view all five possible tRNAGlyGCC parental tRNAs of the 5'-half tRF-31-P4R8YP9LON4VD.
The counter at the top of the results reports the number of genomic locations that satisfy the search parameters. In this vista a tRF sequence may occupy multiple rows if it's located at multiple loci.
The table can be modified using several buttons. You can select specific columns to view (some are always enabled, some are disabled by default), you can change the number of results displayed per page (default is 10), and you can navigate to other pages. Finally, you can download all results in simple text format. In the Genomic Loci vista, only visible columns will be returned.
The main results are presented in a table and are sortable. Click on a column header to sort it in ascending order. Hold shift and click again to sort in descending order. Hold shift and click on two or more columns to sort in any number of columns in any order you want. Hover over the [i] icon to view details about the column's contents.
The results consist of the columns described below.
This vista is molecule-centric and presents a summary of the fragment's basic characteristics. The RNA Molecule vista is meant for those users who want to obtain basic level information of the tRF of interest and its potential origins in a summarized format. The data is presented in a tabular format in this vista, with each row containing a unique tRF sequence.
Example: You can visit the RNA Molecule vista to view the 5'-tRF tRF-23-RK9P4P9LDS, which contains the D-loop and has 13 potential parental tRNAs of 5 distinct anticodons.
The counter shows the number of distinct tRFs that have at least one parental tRNA that fits the selection of filters. Information from all potential parental loci is used to complete the table.
The table can be modified using several buttons. You can change the number of results displayed per page (default is 10) and you can navigate to other pages. Finally, you can download all results in simple text format.
The main results are presented in a table and are sortable. Click on a column header to sort it in ascending order. Hold shift and click again to sort in descending order. Hold shift and click on two or more columns to sort in any number of columns in any order you want. Hover over the [i] icon to view details about the column's contents.
The results consist of the columns described below.
This vista is mature-tRNA-centric and presents the possible alignments between tRFs and the parental tRNA gene. The tRNA Alignment vista is meant for users who want to explore the tRF-generation potential of a tRNA across each of the five structural types or wish to visualize the tRF(s) of their choice. If more than one tRNA is selected, the user must select one before proceeding to the alignment.
If a tRNA is not selected, then MINTbase presents all tRNAs that fit the filters selected. The user has to click on the image on the last column to proceed to the alignment.
Example: You can use the tRNA alignment vista to view the alignments of tRFs on AlaAGC tRNAs.
The counter shows the number of tRNAs that fit your parameters. You have to select one to continue to the alignment.
The table can be modified using several buttons. You can change the number of results displayed per page (default is 10) and you can navigate to other pages. Finally, you can download all results in simple text format.
The main results are presented in a table and are sortable. Click on a column header to sort it in ascending order. Hold shift and click again to sort in descending order. Hold shift and click on two or more columns to sort in any number of columns in any order you want. Hover over the [i] icon to view details about the column's contents.
The results consist of the columns described below.
Once a tRNA is selected, the user can see the alignment of that tRNA and all tRFs that have been selected by the filters of the database.
Example: Continuing the above example, if you select tRNA166-AlaAGC, you can view the alignment.
The counter shows the number of tRFs that satisfy the search parameters and can originate from the specific tRNA. Beside the counter, you can see the color legend used on the page.
The tRNA information title includes tRNA name, assembly and tRNA type. The link leads to the genomic location within the UCSC genomic browser.
The tRNA Alignment vista has three buttons. The first, "Print", is independent: clicking it lets you print the page or save it as a PDF. Make sure that "print background colors" or a similar option is enabled when the print dialog opens. The other two buttons, "Reset" and "Undo", work together with the red X buttons located to the right of each tRF. Click the X button to the right of a tRF to remove that tRF from the page. You can undo the last action by clicking "Undo", undo multiple actions by clicking it repeatedly, or reset the page to its initial state by clicking "Reset". When you click "Reset", the intron (if one exists) also returns to its default visualization. You can use the X buttons to choose the tRFs you want in the visualization before you print.
When the tRNA contains an intron, you can click the "Show/Hide intron" button to show or hide the intron from both the tRNA and the tRF areas.
The rest of the tRNA header contains the tRNA secondary structure, a ruler, and the tRNA sequence. The structure of each tRNA is obtained from gtRNAdb; for tRNA-lookalikes, we use the secondary structure of the source tRNA. The numbered ruler provides an easy way to find the location of the tRF in the spliced or unspliced tRNA (depending on whether the intron is hidden or not). The vertical bar symbol "|" indicates a multiple of ten (the first is ten, the second is twenty, etc.). Finally, the tRNA header is stationary, so you can scroll up and down through the tRFs.
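To show how such a ruler lines up with a sequence, here is a small sketch that places "|" at every tenth position. The function name, the "." filler character, and the toy sequence are assumptions for illustration; the actual page may render the ruler differently.

```python
def build_ruler(length, filler="."):
    """Return a ruler string with '|' at positions 10, 20, 30, ... (1-based)."""
    return "".join(
        "|" if position % 10 == 0 else filler
        for position in range(1, length + 1)
    )


# Made-up 25-nt sequence, used only to show the alignment of ruler and sequence.
sequence = "ACGTTGCAAGCTTAGGCATCGATTA"
print(build_ruler(len(sequence)))
print(sequence)
```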
Under the tRNA header is the scrollable tRF area. The tRF's position is aligned with the tRNA, and the structures are also indicated by using color labels. The tRF sequence is a link to the summary vista. Alongside the sequence, two numbers are displayed, and the button removes the tRF from the display. The first number shows how many samples are currently submitted to MINTbase and contain the tRF and satisfy all other selected criteria (e.g., samples with RPM over 5 or breast tissue samples). The second number shows the maximum RPM of the tRF in those samples. Both numbers are links to Expression vista. There you can see specific information about these samples.
This vista presents information about the samples that include a specific tRF. If the search parameters retrieved multiple tRFs from the database, then the user will select their sequence of interest by clicking on the number of samples or the maximum RPM value.
If a tRF is not selected, MINTbase will prompt you to select one before you can proceed to view the samples.
Example: You can use the tRNA Expression vista to view the samples in which the 5'-halves of tRNATyrGTA are expressed.
The counter at the top of the results reports the number of tRFs that satisfy the search parameters. When a user selects a tRF they can click on the column "# of datasets" to proceed to the list of samples in which the tRF is present.
The table can be modified using several buttons. You can change the number of results displayed per page (default is 10), and you can navigate to other pages. Finally, you can download all results in simple text format.
The main results are presented in a table and are sortable. Click on a column header to sort it in ascending order. Hold shift and click again to sort in descending order. Hold shift and click on two or more columns to sort in any number of columns in any order you want. Hover over the [i] icon to view details about the column's contents.
The results consist of the columns described below.
When one tRF is selected, one row will be shown for each deep-sequencing sample that is included in MINTbase and contains the tRF.
Example: Continuing the above example, you can view the samples in which the 5'-half tRF-30-ROD8N0X0JYOY is expressed. Also see the full list of the samples currently in MINTbase.
The counter at the top of the results reports the number of samples that are currently in the database, contain the tRF and satisfy the search parameters. When a user clicks on the tRF name or sequence on the title, they can open the Summary vista for the tRF.
The table can be modified using several buttons. You can select specific columns to view (some are always enabled, some are disabled by default), you can change the number of results displayed per page (default is 10), and you can navigate to other pages. Finally, you can download all results in simple text format. In the Expression vista, only visible columns will be returned.
The main results are presented in a table and are sortable. Click on a column header to sort it in ascending order. Hold shift and click again to sort in descending order. Hold shift and click on two or more columns to sort in any number of columns in any order you want. Hover over the [i] icon to view details about the column's contents.
The results consist of the columns described below.
This vista summarizes information about a tRF. It serves not only as a reference for each molecule but also as a starting point for exploring the genomic and molecular characteristics of a tRF (e.g., its type(s), the potential tRNA gene sources, the number of instances in tRNA space, etc.) across the available vistas. The vista offers several graphs of the RPM and the number of samples that contain the tRF.
If a tRF is not selected, the summary vista prompts the user to select one. The table shown is identical to the table shown in the Expression vista when a tRF is not selected. To proceed to the Summary vista for the tRF, please click on the tRF sequence. To view details about the columns of this table please visit the Expression vista section above.
When one tRF is selected, MINTbase Summary vista presents all known information about the tRF.
Example: You can view the Summary of the 5'-tRF tRF-19-PS5P4PJ4.
The results consist of the entities described below.
MINTbase also contains a variety of graphs. The graphs are created using Highcharts and are interactive. The three bars at the top right of each graph allow you to view the graph in full screen, print it, and download it in various formats, including SVG (which you can edit). Additionally, if you hover over the graph area, you can view details about the data drawn. Clicking on the legend shows or hides the projects.
The first bar graph shows the percentage of samples per project that contain the tRF with RPM ≥ 1.0. 100% in TCGA-BRCA means that all TCGA-BRCA samples in MINTbase contain this tRF. The "Non-TCGA" column contains all samples that are in MINTbase and are not from TCGA.
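The percentage shown in this graph can be reproduced with a short sketch. The function name, the input layout, and the sample values are hypothetical; the "TCGA-" prefix test used to build the "Non-TCGA" group is an assumption about how projects are labeled.

```python
from collections import defaultdict


def percent_samples_with_trf(samples, threshold=1.0):
    """samples: iterable of (project, rpm) pairs, one per sample, for a single tRF."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for project, rpm in samples:
        # Assumption: every project not labeled "TCGA-..." is pooled as "Non-TCGA".
        group = project if project.startswith("TCGA-") else "Non-TCGA"
        totals[group] += 1
        if rpm >= threshold:
            hits[group] += 1
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


# Invented sample values.
samples = [
    ("TCGA-BRCA", 12.0), ("TCGA-BRCA", 1.0),
    ("TCGA-LIHC", 0.0), ("TCGA-LIHC", 75.0),
    ("OtherStudy", 0.3),
]
print(percent_samples_with_trf(samples))
```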
The second bar graph shows the number of samples per project that contain the tRF within a specified RPM interval. For instance, in the example provided at the beginning of this section, TCGA-LIHC is 44 in the 50-100 interval; that means that 44 LIHC samples in TCGA contain the tRF with an RPM greater than or equal to 50 and less than 100.
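The interval counting can be sketched with half-open bins, where the lower edge is included and the upper edge is excluded. Only the 50-100 interval is named in the text; the other bin edges, the function name, and the RPM values below are assumptions for illustration.

```python
from bisect import bisect_right
from collections import Counter


def count_by_interval(rpm_values, edges):
    """Count RPM values per half-open interval [edges[i], edges[i+1])."""
    counts = Counter()
    for rpm in rpm_values:
        index = bisect_right(edges, rpm) - 1
        if index < 0:
            continue  # below the first edge, so not counted
        low = edges[index]
        if index + 1 < len(edges):
            label = f"{low}-{edges[index + 1]}"
        else:
            label = f"{low}+"  # open-ended last interval
        counts[label] += 1
    return counts


# Hypothetical edges and invented RPM values for one project.
edges = [1, 10, 50, 100]
print(count_by_interval([0.5, 1, 49.9, 50, 99.9, 100, 250], edges))
```

With these inputs, a sample with RPM exactly 50 falls in the 50-100 interval, while a sample with RPM exactly 100 falls in the next one.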
The first boxplot shows the RPM values per project. Hovering over each boxplot, you can see the quartile values and the number of samples. The second boxplot shows the log2 RPM values per project.
Finally, if the user selects a tRNA instead of a tRF, they can see the information MINTbase contains about that tRNA and how the tRFs pile up on the tRNA.
Example: You can view the Summary of a tRNA in MINTbase.
The results consist of the entities described below.
The pie chart shows the numbers of all potential tRFs versus the numbers of the tRFs that are expressed in the samples that the database currently contains. The bar charts pile up the expressed fragments on the tRNA. Each fragment counts once, regardless of the maximum RPM or in how many samples it can be found.
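One way to read the pile-up is as a per-position count of the distinct expressed fragments covering each nucleotide. The sketch below follows that interpretation; the function name and coordinates are hypothetical, positions are assumed to be 1-based and inclusive, and the real charts may aggregate differently.

```python
def pile_up(trna_length, fragments):
    """Count, for each tRNA position, how many distinct fragments cover it.

    fragments: iterable of (start, end) pairs, 1-based and inclusive.
    Each distinct fragment counts once, regardless of RPM or number of samples.
    """
    coverage = [0] * trna_length
    for start, end in set(fragments):  # duplicates collapse to a single fragment
        if not 1 <= start <= end <= trna_length:
            raise ValueError("fragment must lie within the tRNA")
        for position in range(start, end + 1):
            coverage[position - 1] += 1
    return coverage


# Toy 10-nt "tRNA" with two distinct fragments, one of them listed twice.
print(pile_up(10, [(1, 3), (2, 5), (2, 5)]))  # [1, 2, 2, 1, 1, 0, 0, 0, 0, 0]
```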
MINTbase was developed in 2016 [1] and had a major update in 2017 [2]. Samples are currently being added to MINTbase without changing the version number. Please note that results might have changed since the last time you ran a search.
Initial version is released. MINTbase is written in MySQL, Java and JavaScript. Samples are analyzed using MINTmap. Charts are created using Highcharts.