SAS Statistical Analysis HW



   Please describe the correlation between age (AGE5) and BMI (CBMI5).

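Both are continuous variables, so the Pearson correlation coefficient is used. The coefficient is r = 0.06248, which indicates a very weak positive linear association between age and BMI.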
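
The coefficient above can be reproduced with PROC CORR. This is only a minimal sketch: it assumes the data sit in the dataset A that the program at the end of this document creates, and that AGE5 and CBMI5 are stored as numeric variables.

```sas
/* Pearson correlation between age and BMI.              */
/* Assumption: dataset A holds numeric AGE5 and CBMI5.   */
PROC CORR DATA=A PEARSON;
VAR AGE5 CBMI5;
RUN;
```

The output lists the correlation together with a p-value for the null hypothesis that the true correlation is zero.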

   Please try to use a regression to predict BMI (CBMI5) from age (AGE5).
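
One way to approach this question is a simple linear regression with CBMI5 as the response and AGE5 as the only predictor. The sketch below assumes the same dataset A as the program at the end of this document; it is an illustration, not the submitted answer.

```sas
/* Simple linear regression: CBMI5 = b0 + b1*AGE5 + error. */
/* Assumption: dataset A holds numeric AGE5 and CBMI5.     */
PROC REG DATA=A;
MODEL CBMI5 = AGE5;
RUN;
QUIT;
```

The slope estimate for AGE5 is the expected change in BMI for a one-unit increase in AGE5, and the R-square of a one-predictor model equals the squared Pearson correlation between the two variables.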




    Please use the logistic regression model to estimate the effect between cigarette smoking (SMOKE5) and cardiovascular disease (CVD).

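     Both are categorical variables. For smoking, the estimated coefficient is 0.2497 with a p-value of 0.0578, which is not statistically significant at the 0.05 level.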
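
A model of this kind can be fitted with PROC LOGISTIC. The sketch below makes two assumptions that are not stated in the source: SMOKE5 is coded 0 (non-smoker) and 1 (smoker), and reference-cell coding is wanted. CVD is modeled with 1 as the event, matching the coding described later for the survival questions.

```sas
/* Logistic regression of CVD on smoking status.              */
/* Assumptions: SMOKE5 is coded 0/1; dataset A as used below.  */
PROC LOGISTIC DATA=A;
CLASS SMOKE5 (REF='0') / PARAM=REF;
MODEL CVD(EVENT='1') = SMOKE5;
RUN;
```

With reference coding, the exponentiated coefficient is the odds ratio for smokers versus non-smokers. PROC LOGISTIC uses effect coding by default when PARAM= is omitted, so the size of the coefficient depends on which parameterization was used.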
      Please use the logistic regression model to estimate the effect between each age group (AGE5) and cardiovascular disease (CVD).
Note: please categorize age into three groups (e.g., age<50, 50-60, 60+).
First categorize AGE5 into three groups, then fit the logistic regression model.
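
The two steps can be sketched as follows. The names A_AGE and AGEGRP are hypothetical, introduced only for this illustration, and the cut-points place age 60 in the oldest group, which is one possible reading of "50-60, 60+".

```sas
/* Step 1: derive a three-level age group (hypothetical name AGEGRP). */
/* Missing ages are handled first, because SAS treats a missing       */
/* numeric value as smaller than any number.                          */
DATA A_AGE;
SET A;
IF AGE5 = . THEN AGEGRP = .;
ELSE IF AGE5 < 50 THEN AGEGRP = 1;   /* under 50   */
ELSE IF AGE5 < 60 THEN AGEGRP = 2;   /* 50 to 59   */
ELSE AGEGRP = 3;                     /* 60 and up  */
RUN;

/* Step 2: logistic regression with the youngest group as reference. */
PROC LOGISTIC DATA=A_AGE;
CLASS AGEGRP (REF='1') / PARAM=REF;
MODEL CVD(EVENT='1') = AGEGRP;
RUN;
```

Each odds ratio in the output then compares one of the older groups with the under-50 group.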


   We are interested in comparing people on hypertension treatment (trtbp5) with those who are not. Consider the surv.xls dataset used before. Variable cvd is an event indicator (1 if event and 0 if censored) and variable timetocvd denotes the follow-up time.
      Look at the distribution of timetocvd separately for people with and without cvd. Describe what you see, focusing on the mean, median, minimum and maximum in each group.


   Report 15-year Kaplan-Meier rates of CVD for treated and non-treated for hypertension.


 Construct 95% confidence intervals for the rates reported in b and based on these determine if the two groups differ significantly.

 There is a significant difference.
  Run a Cox proportional hazards regression with onset of cardiovascular disease as the outcome and hypertension treatment as the exposure of interest. Report and interpret the hazard ratio and its 95% confidence interval and determine if it is significantly different than 1. 


a.      Run a Cox proportional hazards regression with onset of cardiovascular disease as an outcome and hypertension treatment as the exposure of interest, adjusting for age5 male sbp5 smoke5 tcl5 hdl5 trig5 cbmi5 pai5. Report and interpret the hazard ratio and its 95% confidence interval and determine if it is significantly different than 1.


*import dataset;
DATA A ;
Set work.B;
RUN;

/*5 a. Look at the distribution of timetocvd separately for people with and without cvd. Describe what you see focusing on the mean,
median, minimum and maximum in each group.*/ ;

Proc UNIVARIATE  Data=A ;
CLASS CVD;
VAR   timetocvd ;
RUN;

/*5 b. Report 15-year Kaplan-Meier rates of CVD
for treated and non-treated for hypertension. */;

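/* CVD is the censoring indicator (0 = censored); TRTBP5 defines the groups. */
/* TIMELIST=15 assumes timetocvd is recorded in years.                       */
PROC LIFETEST DATA=A METHOD=KM PLOTS=(S) TIMELIST=15;
TIME timetocvd*CVD(0);
STRATA TRTBP5;
RUN;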

/*5  c. Construct 95% confidence intervals for the rates reported in b and
based on these determine if the two groups differ significantly. */;

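/* OUTSURV= stores the KM estimates with their 95% confidence limits;       */
/* REDUCEOUT keeps only the TIMELIST= time point (15, assuming years).       */
PROC LIFETEST DATA=A METHOD=KM PLOTS=(S) TIMELIST=15 OUTSURV=KM_CL REDUCEOUT;
TIME timetocvd*CVD(0);
STRATA TRTBP5;
RUN;
PROC PRINT DATA=KM_CL; RUN;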

/*5 d. Run a Cox proportional hazards regression with onset of
cardiovascular disease as the outcome and hypertension treatment as
the exposure of interest. Report and interpret the hazard ratio and its 95%
confidence interval and determine if it is significantly different than 1.  */ ;

PROC PHREG DATA=A;
MODEL timetocvd* CVD(0)=TRTBP5/risklimits;
RUN;

/*5 e. Run a Cox proportional hazards regression with onset of cardiovascular disease
as an outcome and hypertension treatment as the exposure of interest,
adjusting for age5 male sbp5 smoke5 tcl5 hdl5 trig5 cbmi5 pai5.
Report and interpret the hazard ratio and its 95% confidence interval and determine if it is significantly different than 1. */ ;

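/* Covariates are kept as originally run; note that sex and diab5 appear here */
/* where the question lists male and pai5.                                    */
PROC PHREG DATA=A;
MODEL  timetocvd*CVD(0)=TRTBP5 age5 sex sbp5 smoke5 tcl5 hdl5 trig5 cbmi5 diab5
 /risklimits;
RUN;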
