TY - JOUR
T1 - Predicting the protein half-life in tissue from its cellular properties
AU - Rahman, Mahbubur
AU - Sadygov, Rovshan G.
N1 - Publisher Copyright: © 2017 Rahman, Sadygov. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2017/7
Y1 - 2017/7
N2 - Protein half-life is an important feature of protein homeostasis (proteostasis). A growing number of in vivo and in vitro studies using high-throughput proteomics provide estimates of protein half-lives in tissues and cells. However, protein half-lives in cells and tissues differ. Because of the resource requirements of tissue research, more data are available from cellular studies than from tissue studies. We have designed a multivariate linear model for predicting protein half-life in tissue from its cellular properties. Inputs to the model are cellular half-life, abundance, intrinsically disordered sequences, and transcriptional and translational rates. Before modeling, we determined substructures in the data using the relative distance from the regression line of the protein half-lives in tissues and cells, identifying three separate clusters. The model was trained on and applied to predict protein half-lives from murine liver, brain and heart tissues. In each tissue type we observed similar prediction patterns of protein half-lives. We found that the model provides the best results when there is a strong correlation between tissue and cell culture protein half-lives. Additionally, we clustered the protein half-lives to determine variations in correlation coefficients between the protein half-lives in tissue versus in cell culture. The clusters identify strongly and weakly correlated protein half-lives, further improve the overall prediction, and identify subgroupings that exhibit specific characteristics. The model described herein is generalizable to other data sets and has been implemented in freely available R code.
AB - Protein half-life is an important feature of protein homeostasis (proteostasis). A growing number of in vivo and in vitro studies using high-throughput proteomics provide estimates of protein half-lives in tissues and cells. However, protein half-lives in cells and tissues differ. Because of the resource requirements of tissue research, more data are available from cellular studies than from tissue studies. We have designed a multivariate linear model for predicting protein half-life in tissue from its cellular properties. Inputs to the model are cellular half-life, abundance, intrinsically disordered sequences, and transcriptional and translational rates. Before modeling, we determined substructures in the data using the relative distance from the regression line of the protein half-lives in tissues and cells, identifying three separate clusters. The model was trained on and applied to predict protein half-lives from murine liver, brain and heart tissues. In each tissue type we observed similar prediction patterns of protein half-lives. We found that the model provides the best results when there is a strong correlation between tissue and cell culture protein half-lives. Additionally, we clustered the protein half-lives to determine variations in correlation coefficients between the protein half-lives in tissue versus in cell culture. The clusters identify strongly and weakly correlated protein half-lives, further improve the overall prediction, and identify subgroupings that exhibit specific characteristics. The model described herein is generalizable to other data sets and has been implemented in freely available R code.
UR - http://www.scopus.com/inward/record.url?scp=85025092603&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85025092603&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0180428
DO - 10.1371/journal.pone.0180428
M3 - Article
C2 - 28719664
AN - SCOPUS:85025092603
SN - 1932-6203
VL - 12
JO - PLoS ONE
JF - PLoS ONE
IS - 7
M1 - e0180428
ER -