Tools
One of our main research interests is developing new web tools and databases for bioinformatics. We believe this work goes beyond the technical effort: it improves our own data analysis methods and shares with the community some of what is generated in our lab. Our available tools are listed below:
RCPedia
Retrocopies are copies of mature RNAs that are usually devoid of regulatory sequences and introns. They have routinely been classified as processed pseudo-genes with little or no biological relevance. However, recent findings have revealed functional roles for retrocopies, as well as their high frequency in some organisms, such as primates. Despite their increasing importance, there is no user-friendly and publicly available resource for the study of retrocopies. Here, we present RCPedia, an integrative and user-friendly database designed for the study of retrocopied genes. RCPedia contains a complete catalogue of the retrocopies that are known to be present in human and five other primate genomes, their genomic context, inter-species conservation and gene expression data. RCPedia also offers a streamlined data representation and an efficient query system.
miRIAD
MicroRNAs (miRNAs) are a class of small (~22 nucleotides) non-coding RNAs that post-transcriptionally regulate gene expression by interacting with target mRNAs. A majority of miRNAs are located within intronic or exonic regions of protein-coding genes (host genes), and increasing evidence suggests a functional relationship between these miRNAs and their host genes. Here, we introduce miRIAD, a web service that facilitates the analysis of genomic and structural features of intragenic miRNAs and their host genes for five species (human, rhesus monkey, mouse, chicken and opossum). miRIAD contains the genomic classification of all miRNAs (inter- and intragenic), as well as the classification of all protein-coding genes into host or non-host genes (depending on whether or not they contain an intragenic miRNA). We collected and processed public data from several sources to provide a clear visualization of relevant knowledge related to intragenic miRNAs, such as host gene function, genomic context, names of and references to intragenic miRNAs, miRNA binding sites, clusters of intragenic miRNAs, miRNA and host gene expression across different tissues, and expression correlation between intragenic miRNAs and their host genes. Protein–protein interaction data are also presented for functional network analysis of host genes. In summary, miRIAD was designed to help the research community explore, in a user-friendly environment and with minimal effort, intragenic miRNAs, their host genes and functional annotations, facilitating hypothesis generation and in silico validation.
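The inter-/intragenic and host/non-host classification described above can be illustrated with a minimal Python sketch. It is not miRIAD's actual implementation: the `Feature` record, the `classify` function, the full-containment criterion, the same-strand requirement and the toy coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval (hypothetical record type, 1-based inclusive coordinates)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str


def contains(gene, mirna, same_strand=True):
    """Return True if the miRNA lies entirely within the gene.

    Assumption: 'intragenic' means full containment; requiring the same
    strand is optional and controlled by `same_strand`.
    """
    if gene.chrom != mirna.chrom:
        return False
    if same_strand and gene.strand != mirna.strand:
        return False
    return gene.start <= mirna.start and mirna.end <= gene.end


def classify(mirnas, genes, same_strand=True):
    """Label each miRNA as intragenic/intergenic and each gene as host/non-host."""
    hosts_of = {
        m.name: [g.name for g in genes if contains(g, m, same_strand)]
        for m in mirnas
    }
    mirna_class = {
        name: "intragenic" if hosts else "intergenic"
        for name, hosts in hosts_of.items()
    }
    host_names = {h for hosts in hosts_of.values() for h in hosts}
    gene_class = {
        g.name: "host" if g.name in host_names else "non-host" for g in genes
    }
    return mirna_class, gene_class, hosts_of


# Toy data with made-up names and coordinates.
genes = [
    Feature("GENE1", "chr1", 100, 500, "+"),
    Feature("GENE2", "chr2", 100, 500, "+"),
]
mirnas = [
    Feature("mir-A", "chr1", 150, 171, "+"),  # inside GENE1, same strand
    Feature("mir-B", "chr1", 900, 921, "+"),  # outside every gene
    Feature("mir-C", "chr1", 160, 181, "-"),  # inside GENE1, opposite strand
]

mirna_class, gene_class, hosts_of = classify(mirnas, genes)
print(mirna_class)
print(gene_class)
```

Passing `same_strand=False` treats a miRNA on the opposite strand of a gene as intragenic as well; which convention is appropriate depends on the analysis.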
piRNAdb
Found in several metazoan species, piRNAs play a role not only in the repression of transposable elements, but also in regulating cellular development and differentiation. Despite their importance, piRNAs are not well understood and remain underexplored. A comprehensive and easy-to-use piRNA database is still needed and would be an important contribution toward making piRNAs more widely studied. Here, we present piRNAdb, an integrative and user-friendly database designed for the study of piRNAs. piRNAdb integrates 27,329 piRNAs from four human small RNA-seq datasets, as well as their genomic locations, clustering information and related transposable elements. We also display information about 23,380 genes that are putative targets of piRNAs, such as CNTNAP2, which functions in nervous system development, and SMC5, a DNA repair gene that acts on double-strand breaks and is differentially expressed in cancer. Related to those genes, we found 47,060 ontology terms, data only available on our website. Moreover, our database provides a feedback system to improve the exchange of information and knowledge among those studying piRNAs. The development of new features to facilitate piRNA analysis, data visualization and integration is the major pillar of piRNAdb. We believe that our web tool, by providing a streamlined and well-organized data repository for piRNAs, will be extremely important not only to those already studying piRNAs, but also in making piRNAs accessible to a broad community.
Reboot
Gene expression data have become increasingly common and accessible with the advent of modern RNA-Seq platforms. Furthermore, several tools are available to process such data. However, most of them require a level of expertise restricted to computational biologists. Consequently, clinicians often make decisions regarding a patient's prognosis based solely on mutation data generated by specialized companies, without considering either the corresponding expression levels or the associated alternative splicing patterns. This is especially relevant because not all alterations will, in fact, be expressed in a given tissue of a particular patient under a specific space/time condition. As a result, the relationship between current molecular diagnostic approaches and clinical applications is not straightforward. To improve current translational medicine strategies, we present Reboot (REgression and survival tool with a multivariate BOOTstrap approach): a user-friendly web application to perform survival analysis from high-dimensional gene/transcript expression datasets. Reboot innovates by using a multivariate strategy with penalized Cox regression (LASSO method) combined with a bootstrap approach, in addition to statistical tests that support the findings, which are automatically pre-filtered and plotted. Moreover, Reboot has a more powerful command-line version with detailed documentation, allowing even less-experienced users to perform not only modular but also integrative analyses and validations.
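To illustrate the general idea of pairing a variable-selection step with bootstrap resampling, the following Python sketch counts how often each feature is selected across resampled patient cohorts. It is not Reboot's code: the function names, the record layout and the toy data are hypothetical, and the LASSO-penalized Cox fit is deliberately left out and replaced by a placeholder selector.

```python
import random
from collections import Counter


def bootstrap_selection_frequency(samples, select_features, n_boot=100, seed=0):
    """Resample patients with replacement and count how often each feature is selected.

    `select_features` is any callable that takes a list of patient records and
    returns the names of the selected features. In a Reboot-like analysis it
    would wrap a LASSO-penalized Cox fit and return the features with non-zero
    coefficients; that fit is NOT implemented in this sketch.
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(samples)
    for _ in range(n_boot):
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        counts.update(set(select_features(resample)))
    return {feature: count / n_boot for feature, count in counts.items()}


def toy_selector(records, threshold=1.0):
    """Placeholder selector (an assumption, not a Cox model).

    Picks genes whose mean expression differs by more than `threshold`
    between patients with and without an observed event.
    """
    with_event = [r for r in records if r["event"]]
    without_event = [r for r in records if not r["event"]]
    if not with_event or not without_event:
        return []
    selected = []
    for gene in records[0]["expr"]:
        mean_event = sum(r["expr"][gene] for r in with_event) / len(with_event)
        mean_other = sum(r["expr"][gene] for r in without_event) / len(without_event)
        if abs(mean_event - mean_other) > threshold:
            selected.append(gene)
    return selected


# Made-up cohort: geneA separates the two groups, geneB does not.
cohort = [
    {"time": 5, "event": True, "expr": {"geneA": 5.0, "geneB": 2.0}},
    {"time": 7, "event": True, "expr": {"geneA": 5.0, "geneB": 2.0}},
    {"time": 9, "event": True, "expr": {"geneA": 5.0, "geneB": 2.0}},
    {"time": 30, "event": False, "expr": {"geneA": 1.0, "geneB": 2.0}},
    {"time": 32, "event": False, "expr": {"geneA": 1.0, "geneB": 2.0}},
    {"time": 35, "event": False, "expr": {"geneA": 1.0, "geneB": 2.0}},
]

frequencies = bootstrap_selection_frequency(cohort, toy_selector, n_boot=50)
print(frequencies)
```

Features that are selected in a high fraction of resamples are more stable candidates than those picked only occasionally; how such frequencies feed into a final signature is a design decision that this sketch does not address.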