PMC:5197943 / 29182-37303
Annotations
{"target":"https://pubannotation.org/docs/sourcedb/PMC/sourceid/5197943","sourcedb":"PMC","sourceid":"5197943","source_url":"https://www.ncbi.nlm.nih.gov/pmc/5197943","text":"3.2. Case Study\nIn this section, we discuss K and M assessment in the context of overall survival before the event of interest and show, through a case study, how to read the results of an overall survival analysis.\nGenerally, the purpose of overall survival analysis is to use the available data to assess the chance of surviving to different times.\nClinical annotations are provided by the physicians and include overall survival, PFS, and response for each patient. These data are added to the DMET dataset (see Table 1) to obtain the OS-dataset, which is used to correlate each SNP of the ADME genes with OS and PFS.\nThe OS-dataset (shown in Table 2) is obtained by adding the clinical annotations (temporal clinical trends for each patient) described below. Overall survival data (expressed in months) are collected from the starting point, for example, when the treatment starts or when the subject is enrolled in the study, to the end point, that is, when the event of interest (i.e., death) is reached. The occurrence of the event of interest is recorded in the Status-OS variable, where 1 means that the event of interest occurred and 0 indicates censored data, i.e., the subject dropped out of the study for an unknown reason. PFS data (expressed in months) are collected for each subject, beginning when the subject starts the treatment and ending when the disease progresses or when the subject dies for any reason. The PFS-Status variable takes a value of 1 to indicate the occurrence of the event of interest, whereas a value of 0 indicates censored data. 
Finally, the response variable conveys the presence of metastasis when its value equals 1 and the absence of metastasis when its value equals 0.\nIn this way, OSAnalyzer can compute OS and PFS according to the presence or absence of SNPs of the ADME genes for each probe and, by using the log-rank test, can compare and rank each SNP by the significance of its p-value.\nThese results may help clinicians to understand whether those SNPs play a role in improving the response to cancer treatments and, ultimately, patients’ outcomes.\nA K and M estimator provides a graph of the survival function that summarizes the time-related information. To illustrate OS analysis with OSAnalyzer, we randomly generated a synthetic dataset of 80 patients affected by an advanced cancer as the basis of an observational study of this disease.\nThus, survival analysis uses information from the whole follow-up period, allowing us to illustrate the important point that comparative analysis between OS-curves depends upon the area under the K and M curves (AUC) and not only on differences at single points, especially in real clinical studies.\nThe first step of the K and M analysis concerns data collection and arrangement. Data arrangement is necessary to put the data in the format expected by the chosen analysis tool. Many statistical analysis tools exist: some are available under the GNU General Public License, such as OSAnalyzer (https://sites.google.com/site/overallsurvivalanalyzer/) and PSPP, and others are proprietary, such as SPSS and MATLAB, each with its own requirements in terms of data arrangement. 
All of the software cited above requires that data be arranged in tabular form, containing at least the following information: (i) serial times; (ii) status at serial time (1 event of interest; 0 censored); and (iii) other kinds of clinical data, such as response rate, histotype, sex, etc., as nominal variables.\nIn any case, before beginning the analysis of the data, it is necessary to choose the analysis tool. The choice must be made according to the type of data to analyze. For instance, to analyze an SNP DMET dataset with SPSS, the user must make an extra effort to convert each single SNP “A/A, A/T, ...” into numerical values, because SPSS cannot analyze string values. Such conversion must be done manually by the user, which increases the probability of introducing errors. It is worth noting that the translated file is also necessary for the PSPP software tool. To avoid this expensive step, OSAnalyzer can automatically analyze such SNP DMET datasets and, most importantly, provide users with all of the results ranked according to the statistical significance of the log-rank test.\nTo illustrate how this all works, we prepared a synthetic OS-dataset extended with temporal data related to the subjects in each of the three groups corresponding to the three allele variants in each probe (80 subjects in total). The event of interest is “death”, represented by the symbol 1. To understand the K and M curves, let us look at Figure 7.\nThe lengths of the horizontal lines along the X-axis represent the duration of the survival time. The slopes indicate the end of an interval, due to the occurrence of the event of interest. The vertical lines have an aesthetic function only, because they make the curve more pleasing to observe. Although the primary function of the vertical lines is aesthetic, the distance between horizontal lines is crucial because it conveys the change in cumulative probability. 
The following is an example of how the points of a survival curve could be roughly interpreted. The cumulative probability of surviving a given time can be read on the Y-axis. For example, the probability of surviving 28 months for patients in the group labeled “T/T” is 60%; conversely, the probability of surviving the same time for patients belonging to the groups “C/T” and “C/C” is slightly more than 90%. It is worth noting that the steepness of the curve depends on the length of the horizontal lines, that is, on the time elapsed without the event of interest. The censored patients are another element that affects the survival probability. Censored patients are represented as tick marks on the survival curves, and censored values affect the cumulative probability of the groups under investigation. In detail, the fourth and the fourteenth censored patients (represented by the ticks on the curve) in the “C/T” and “T/T” groups, respectively, contribute to reducing the probability of surviving at least 28 months, whereas the fifth censored patient in the “C/C” group did not change the probability of surviving 28 months. However, the censored values contained in the three groups contribute to reducing the cumulative survival across the intervals. Hence, we must be careful in interpreting anything beyond this point, because our temporal data do not allow us to extrapolate any further hypothesis on survival. It is worth noting that intervals (horizontal lines) in the K and M curve are constructed only for the events of interest and not for the censored patients. As stated, this is conveyed in Figure 7 by means of the corners joining horizontal and vertical segments. Thus, in groups “C/T”, “C/C”, and “T/T”, there are four, three, and nine events (vertical connections between the end of one interval and the beginning of the next) demarcating five, four, and ten intervals (horizontals), respectively. 
It is worth noting that there are no vertical changes due to the censored patients. Moreover, Figure 7 clearly highlights the capability of the K and M method to deal with variable intervals.\nThe comparison of survival curves is the most important step in all medical oncology clinical trial studies. The shape of the curve is important to evaluate. Curves that have many small steps usually have a higher number of participating subjects, whereas curves with large steps usually have a limited number of subjects and are, thus, less accurate. Although it is simple to visualize the difference between two survival curves, the difference must be quantified to assess statistical significance. The log-rank test and the hazard ratio are the most common methods used for comparing survival curves. In detail, the log-rank test indicates whether two curves are statistically different, whereas the hazard ratio shows the increased rate of having an event in one curve versus the other.","divisions":[{"label":"Title","span":{"begin":0,"end":15}}],"tracks":[]}
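As a rough illustration of the K and M estimator and the two-group log-rank test discussed in the annotated text, the following Python sketch works on the tabular arrangement the text describes (serial times plus a status column, 1 = event of interest, 0 = censored). It is not OSAnalyzer's implementation: the function names `kaplan_meier` and `logrank` and the toy data are assumptions made only for this example, and the log-rank statistic uses the standard hypergeometric variance with one degree of freedom.

```python
import math
from collections import Counter


def kaplan_meier(times, status):
    """Return [(t, S(t))] at each distinct event time.

    status: 1 = event of interest, 0 = censored.
    Censored subjects never create a step; they only leave the risk set.
    """
    events = Counter(t for t, s in zip(times, status) if s == 1)
    surv, curve = 1.0, []
    for t in sorted(events):
        at_risk = sum(1 for u in times if u >= t)  # subjects still followed at t
        surv *= 1.0 - events[t] / at_risk          # product-limit update
        curve.append((t, surv))
    return curve


def logrank(times_a, status_a, times_b, status_b):
    """Two-group log-rank test; returns (chi_square, p_value), 1 degree of freedom."""
    pooled = [(t, s, 0) for t, s in zip(times_a, status_a)]
    pooled += [(t, s, 1) for t, s in zip(times_b, status_b)]
    event_times = sorted({t for t, s, _ in pooled if s == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in pooled if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, s, _ in pooled if u == t and s == 1)     # events at t
        d_a = sum(1 for u, s, g in pooled if u == t and s == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Toy data (months, status); purely illustrative, not taken from the paper.
    print(kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]))
    print(logrank([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1]))
```

In the first toy call, the subject censored at month 3 produces no step in the curve but still shrinks the risk set for the event at month 5, which is the behaviour the text describes for the tick marks in Figure 7.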