
Supporting open scientific collaborations for clear, reproducible science.


Although collecting high-dimensional genomic, molecular, and imaging data in clinical and biological settings is rapidly becoming more widespread and affordable, this new era of “big data” has not yet delivered widespread impact on patient outcomes. The bottleneck in discovery is now the ability of scientists to integrate, analyze, and interpret ever more complex and inter-related datasets. Sage Bionetworks has developed Synapse as an informatics platform dedicated to supporting the large-scale pooling of data, knowledge, and expertise across institutional boundaries to solve some of the most challenging problems in biomedical research.

We have designed Synapse with two major categories of user in mind. First, for biomedical data scientists, we have created an environment for sharing all of their digital research assets, including data, code, and analysis results. These assets can be broadly shared and accessed through a unified set of RESTful APIs, on top of which Synapse provides integration points in R, Python, and the Linux shell so that findings can be disseminated across common analytical environments. Furthermore, Synapse provides advanced capabilities for formally tracking the relationships between these digital assets through the Synapse provenance system, and for documenting and disseminating analyses in ways that others can reproduce and reuse. These capabilities are key to supporting larger research teams, such as the 200+ scientists who collaborated through Synapse as part of the TCGA Pan Cancer Analysis Working Group.
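To make the idea of provenance tracking concrete, here is a minimal sketch in plain Python of how relationships between data, code, and results can be recorded and then traced backwards. It is not the Synapse client API: the `Activity` class, the `lineage` function, the field names, and the asset names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Activity:
    """One analysis step (hypothetical model, not the Synapse API)."""
    name: str
    used: List[str] = field(default_factory=list)       # input data assets
    executed: List[str] = field(default_factory=list)   # code that was run
    generated: List[str] = field(default_factory=list)  # outputs produced


def lineage(asset: str, activities: List[Activity]) -> Set[str]:
    """Return every asset that `asset` depends on, directly or indirectly."""
    # Map each output to the activity that produced it.
    produced_by = {out: act for act in activities for out in act.generated}
    seen: Set[str] = set()
    stack = [asset]
    while stack:
        current = stack.pop()
        act = produced_by.get(current)
        if act is None:
            continue  # a raw input: nothing recorded upstream of it
        for dep in act.used + act.executed:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# Illustrative, made-up assets: a two-step analysis.
activities = [
    Activity("normalize", used=["raw_counts"],
             executed=["normalize.py"], generated=["norm_counts"]),
    Activity("fit_model", used=["norm_counts", "clinical"],
             executed=["fit.R"], generated=["model_results"]),
]

print(sorted(lineage("model_results", activities)))
```

Recording each step in this way lets a collaborator ask which inputs and scripts a given result ultimately depends on, which is the kind of question a provenance system is meant to answer.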

Second, for clinical and biological scientists, we have focused on developing Synapse as a community hub that creates a partnership between computational and analytical users and more biologically and clinically minded scientists. In particular, the Synapse web portal is designed to enhance this communication and to grow communities of scientists with diverse skill sets who can work together to interpret complex and diverse biomedical datasets. For example, our currently running Rheumatoid Arthritis Challenge was developed after we were approached by Dr. Robert Plenge, who had assembled a rich genomic and clinical dataset by combining data across a dozen clinical studies that assessed the response to anti-TNFα therapy in rheumatoid arthritis.