AusPAR Attachment 3: Extract from the Supplementary Clinical Evaluation Report for Daclizumab




7.1.2.Study 205MS301


Study abstract: A Multicenter, Double-blind, Randomized, Parallel-group, Monotherapy, Active-control Study to Determine the Efficacy and Safety of Daclizumab High Yield Process (DAC HYP) versus Avonex (Interferon β-1a) in Patients with Relapsing-Remitting Multiple Sclerosis.
7.1.2.1.Study design, objectives, locations and dates

This Phase 3 study compared DAC HYP at the proposed dose (150 mg SC every 4 weeks) with an active control, IFN β-1a (tradename: Avonex) intramuscularly (IM) weekly, in subjects with RRMS, using a randomised, double-blind, parallel-group design. The study was reasonably large (n = 1841) and of acceptable duration (up to 144 weeks), so it can be considered the main pivotal study of the submission. The study lacked a placebo group, but the Phase 2 study 205MS201 provided placebo-controlled data and, in combination, the two studies provide a reasonably clear assessment of the efficacy of DAC HYP.

The primary objective of the study was ‘to test the superiority of DAC HYP compared to IFN β-1a in preventing MS relapse in subjects with RRMS.’

Secondary objectives were ‘to test the superiority of DAC HYP compared with IFN β-1a in slowing functional decline and disability progression and maintaining quality of life in this subject population.’

Additional objectives were to assess other long-term efficacy measures including neurological function and brain atrophy, to assess safety and tolerability, to gather pharmacokinetic (PK) data, and to study the effect of DAC HYP on pharmacodynamic (PD) markers.

The study was conducted at 246 sites in 28 countries (Argentina, Australia, Brazil, Canada, Czech Republic, Denmark, Finland, France, Georgia, Germany, Greece, Hungary, India, Ireland, Israel, Italy, Mexico, Moldova, Poland, Romania, Russia, Serbia, Slovenia, Spain, Sweden, Switzerland, Ukraine, United Kingdom, the United States of America) and ran from 11th May 2010 (first treatment) to 5th March 2014.

The most important inclusion and exclusion criteria, as listed in the study synopsis, are reproduced below. The complete list of formal entry criteria included more detailed restrictions based on concomitant diseases and abnormal laboratory tests at baseline.

The inclusion criteria were similar to the other pivotal study, 205MS201, and attempted to restrict the study to subjects with active RRMS and no major progression.

The indicators of active disease were similar in both studies, but Study 205MS301 (discussed here) required 2 relapses in the last 3 years (or radiological evidence of multiple relapses) as well as 1 relapse in the last 12 months. EDSS restrictions were identical to the previous study (0.0 to 5.0 inclusive), and pose the same difficulty of interpretation in that subjects with higher EDSS may have had some progression between relapses, and therefore may have had SPMS or progressive relapsing MS (PRMS). As noted in the discussion of Study 205MS201, SPMS and PRMS were notionally listed as exclusion criteria, but their definitions involved 3 months of continuous worsening, which may be very difficult to identify in clinical practice.

The requirement for at least 2 relapses in the previous 3 years (or a radiological substitute for clinical relapses) means that, despite the potential inclusion of some subjects with an element of progression, the cohort studied clearly had active, relapsing disease. The protocol-specified exclusion of subjects with SPMS and PRMS means that the study was focussed on the frequently relapsing, minimally-progressive end of the MS spectrum. It is therefore not known whether the benefits observed in this study would be reproduced in subjects with only one relapse in the previous 3 years, or in subjects where definite progression was present. This means that the proposed indication in the PI is too broad, as already discussed in the context of Study MS201 (see Section: Study 205MS201; Study design, objectives, locations and dates). This issue is addressed further in the discussion of the sponsor’s response to key EMA Questions and in suggested edits to the PI.

Inclusion criteria

As per study report, the main inclusion criteria were:

  • ‘Ability to understand the purpose and risks of the study and provide signed and dated informed consent and authorisation to use protected health information in accordance with national and local subject privacy regulations.

  • Must have been 18 to 55 years of age, inclusive, at the time of consent.

  • Must have had a confirmed diagnosis of RRMS, as defined by McDonald criteria 1 through 4.

  • Must have had an EDSS score between 0.0 and 5.0, inclusive.

  • Must have experienced 2 or more clinical relapses within the previous 3 years, with at least 1 clinical relapse having occurred within the 12 months prior to randomisation or 1 or more clinical relapses and 1 or more new MRI lesions (Gd-enhancing and/or T2 hyperintense lesion) within the previous 2 years and with at least 1 of these events in the 12 months prior to randomisation. The new MRI lesion must have been distinct from one associated with the clinical relapse. The baseline MRI could be used to satisfy this criterion.

  • Women of childbearing potential must have been willing to practice effective contraception during the study and been willing and able to continue contraception for 4 months after their last dose of study treatment.’
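
The disease-activity criterion in the list above offers two alternative routes to eligibility, which can be easier to follow when written out as logic. The following Python sketch is illustrative only: the function name and the representation of events as months before randomisation are assumptions made here, and the sketch does not model the requirement that the new MRI lesion be distinct from the lesion associated with the clinical relapse.

```python
def meets_activity_criterion(relapse_months, mri_lesion_months):
    """Return True if the disease-activity inclusion criterion is met.

    Both arguments are lists of event times expressed as months before
    randomisation (a hypothetical representation chosen for this sketch).
    """
    # Route A: >= 2 clinical relapses in the previous 3 years,
    # with >= 1 relapse in the 12 months before randomisation.
    relapses_3y = [m for m in relapse_months if m <= 36]
    route_a = len(relapses_3y) >= 2 and any(m <= 12 for m in relapses_3y)

    # Route B: >= 1 relapse and >= 1 new MRI lesion in the previous 2 years,
    # with >= 1 of these events in the 12 months before randomisation.
    relapses_2y = [m for m in relapse_months if m <= 24]
    lesions_2y = [m for m in mri_lesion_months if m <= 24]
    route_b = (
        len(relapses_2y) >= 1
        and len(lesions_2y) >= 1
        and any(m <= 12 for m in relapses_2y + lesions_2y)
    )
    return route_a or route_b
```

Under this reading, a subject with a single relapse 20 months before randomisation qualifies only if a new MRI lesion was also documented within the previous 2 years and at least one of the two events fell within the last 12 months.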
Exclusion criteria

The main exclusion criteria were:

  • ‘Diagnosis of primary progressive, secondary progressive or progressive relapsing MS. These conditions required the presence of continuous clinical disease worsening over at least 3 months. Subjects with these conditions may also have had superimposed relapses, but were distinguished from subjects with RRMS by the lack of clinically stable periods or clinical improvement.

  • Known intolerance, contraindication to, or history of noncompliance with IFN-β (Avonex) 30 μg. (Note: Current or prior use of an approved IFN-β preparation for MS, including Avonex, was allowed as long as the subject was appropriate for IFN-β treatment according to local prescribing information).

  • History of abnormal laboratory results that, in the opinion of the investigator, were indicative of any significant cardiac, endocrine, haematological, hepatic, immunologic, metabolic, urologic, pulmonary, gastrointestinal, dermatologic, psychiatric, renal, neurological (other than MS), and/or other major disease that would have precluded administration of DAC HYP or IFN-β.

  • An MS relapse that had occurred within the 50 days prior to randomisation and/or the subject had not stabilised from a previous relapse prior to randomisation.

  • Any previous treatment with daclizumab or other anti-CD25 monoclonal antibody.

  • Prior treatment with mitoxantrone, cyclophosphamide, fingolimod, or natalizumab within 1 year prior to randomization.’
7.1.2.2.Study treatments

Subjects were randomised in a 1:1 ratio to DAC HYP or IFN β-1a (Avonex).

  • DAC HYP was administered at the proposed dose of 150 mg SC 4-weekly.

  • IFN β-1a was supplied as Avonex in a pre-filled syringe (PFS) for IM injection. Each 0.5 mL of comparator study drug contained 30 μg of IFN β-1a, and was administered once weekly. This is the standard registered dose for Avonex.

Subjects received study treatment in this study for up to 144 weeks. The study duration was described by the sponsor as ‘3 years’ but this would require a treatment period of 156 weeks. By design, as the study was permitted to end when the last enrolled subject had received treatment for 96 weeks, some subjects enrolled later in the study received treatment for less than 2 years, even without considering those who terminated prematurely. The PI describes the study duration as ‘a minimum of 2 to a maximum of 3 years (96 to 144 weeks).’ This is not accurate, and the statement should be corrected.

It should be noted that, although Avonex is a registered active treatment for MS, it is considered by many neurologists to be less effective than other active treatments, and thus represents a soft target for head-to-head trials. Some direct head-to-head trials have suggested superiority of other interferon therapies relative to Avonex.6, 7 It therefore appears plausible that DAC HYP might have shown less relative benefit if compared with a different active therapy.

The use of concomitant medications was restricted, as described previously for Study MS201: methylprednisolone was permitted for relapses, symptomatic treatments were stabilised prior to the randomised study period, where possible, and disease-modifying agents were prohibited.

7.1.2.3.Efficacy variables and outcomes

The study synopsis listed the following efficacy variables:

‘Clinical outcomes:

  • Clinical relapses

  • EDSS

  • Subject global assessment (as measured by the QoL questionnaires: EQ-5D and MSIS-29)

  • Multiple Sclerosis Functional Composite (MSFC)

  • Visual Function Test (VFT)

  • Symbol Digit Modalities Test (SDMT)

Relapses that were determined to meet protocol-defined criteria were subsequently evaluated by the Independent Neurology Evaluation Committee (INEC).

Brain MRI outcomes:

  • Total number of new Gd-enhancing lesions

  • New or newly enlarging T2 hyperintense lesions

  • New T1 hypointense lesions

  • Volume of T2 hyperintense lesions

  • Volume of T1 hypointense lesions

  • Brain atrophy.’
Primary efficacy endpoint

The primary efficacy endpoint was the adjusted ARR based on INEC-confirmed relapses. This is the same primary endpoint as the other pivotal DAC HYP study, and similar relapse-based primary endpoints have been used for most registration studies for disease-modifying agents in MS.
Secondary efficacy endpoints

Secondary efficacy endpoints were listed as follows, in rank order:

  1. Number of new or newly enlarging T2 hyperintense lesions on brain MRI over 96 weeks

  2. Proportion of subjects with confirmed disability progression defined by at least a 1.0-point increase on the EDSS from a baseline EDSS ≥ 1.0 that was sustained for 12 weeks or at least a 1.5-point increase on the EDSS from a baseline EDSS = 0 that was sustained for 12 weeks

  3. Proportion of subjects who were relapse free

  4. Proportion of subjects with a ≥ 7.5-point worsening from baseline in the MSIS-29 Physical Impact score at 96 weeks
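
The definition of confirmed disability progression in the list above can be illustrated with a short Python sketch. It is a simplified reading, not the sponsor's algorithm: the function names, the visit representation, and the rule that every visit from onset up to the first visit at least 12 weeks later must meet the threshold are all assumptions made for illustration.

```python
def progression_threshold(baseline_edss):
    """Minimum EDSS increase counted as progression for a given baseline.

    A baseline of 0 requires a 1.5-point increase; a baseline of 1.0 or
    more requires a 1.0-point increase.
    """
    return 1.5 if baseline_edss == 0 else 1.0


def first_confirmed_progression(baseline_edss, visits, confirm_weeks=12):
    """Return the onset week of the first confirmed progression, or None.

    visits: list of (week, edss) tuples for post-baseline visits, sorted
    by week. Assumption for this sketch: an increase is 'sustained' if
    every visit from onset up to the first visit at least `confirm_weeks`
    later also meets the threshold.
    """
    required = baseline_edss + progression_threshold(baseline_edss)
    for i, (onset_week, edss) in enumerate(visits):
        if edss < required:
            continue
        for week, later_edss in visits[i + 1:]:
            if later_edss < required:
                break  # the increase was not sustained
            if week - onset_week >= confirm_weeks:
                return onset_week  # confirmed at this later visit
    return None
```

For example, with a baseline EDSS of 2.0, a score of 3.0 at Week 24 that is still 3.0 at Week 36 would count as progression with onset at Week 24, whereas a score of 3.0 at Week 12 that returns to 2.0 at Week 24 would not.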

These endpoints are reasonable. The EDSS is a standard measure of disability in MS, and it has been widely used and validated. It was used in this study as a baseline stratification measure, and it was used to define disease progression. The EQ-5D is a validated QoL measure used in many major efficacy studies.

The MSIS-29 is a validated, MS-specific QoL measure that was also used in Study 205MS201. It includes 2 sub-scales: the 20-item Physical Impact scale and the 9-item Psychological Impact scale. Increased scores represent worsening from baseline and decreased scores represent improvement. In validation studies, a change of ≥ 7.5 points was considered to be clinically meaningful.8,9 This study used a ≥ 7.5 point change on the Physical subscale as a secondary endpoint.

The MSFC is a validated measure of disability that can be used as an alternative to the EDSS; it is based on aggregate performance in a number of tasks assessing walking, upper limb function and vision. MSFC scores are scaled by the standard deviation of raw scores obtained in a control group, so the resulting scores are somewhat abstract, difficult to read at a glance, and cannot be applied in isolation to a single patient. These are some of the reasons that the MSFC has not replaced the EDSS, which remains the gold standard measure of disability in MS.

The VFT and SDMT are validated, task-specific tests looking at a subset of neurological skills; they are not relevant unless subjects develop deficits in the domains under consideration, so they are only useful as minor endpoints.

The radiological outcomes listed above are standard objective measures of disease activity, with Gd lesions representing reasonably specific evidence of recent inflammatory activity, and hence active plaques. T2 lesions are the hallmark of MS, but require comparison with old scans to determine whether they are new or recent. T1 hypointense lesions represent focal loss of brain tissue, particularly axons, and correlate with permanent disability. Brain atrophy worsens throughout the course of the disease, and corresponds with long-term disease activity and cognitive decline; one issue posed by interpreting atrophy is that a reduction in inflammation in the brain may cause a reduction in volume, known as pseudoatrophy, and this may mask relative changes in the progression of true atrophy.

Although MRI outcomes are objective, they are usually treated as secondary endpoints because it is possible that a treatment might improve MRI measures without an associated clinical correlate, and such a treatment would not be clinically useful.

7.1.2.4.Randomisation and blinding methods

Randomisation was performed with an IVRS, and randomisation codes were not made available to patients or treating or rating clinicians. Randomisation to each of the two treatment arms was performed with a 1:1 ratio.

Blinding was attempted by using a double-dummy approach, with a placebo for IFN-β-1a and a placebo for DAC HYP, each administered with the same dosing regimen as the corresponding active treatment. There was appropriate separation of the treating and rating neurologists, and the reporting radiologists, as described for Study 205MS201.

IFN-β-1a, like other interferon beta therapies, is associated with a number of ‘tell-tale’ side effects including injection site reactions (ISRs) and flu-like symptoms, also characterised as influenza-like illness. The IM approach means that ISRs are usually much less evident with Avonex than with some other IFN therapies, which are administered subcutaneously. Skin reactions were more commonly seen in the DAC HYP group, as acknowledged by the sponsor:

‘In Study 205MS301, the incidence of cutaneous events by system organ class (SOC) was higher in the DAC HYP group (37%) than in the IFN β-1a group (19%)’.

These percentages do not appear to have included injection site reactions:

‘The most common (cutaneous) events (≥ 2%) by preferred term in the DAC HYP group were rash (7%); eczema (4%); seborrheic dermatitis, acne, erythema, and pruritus (3% each); and dermatitis, dermatitis allergic, dermatitis contact, dermatitis atopic, rash maculopapular, dry skin, alopecia, urticaria, and psoriasis (2% each).’

Injection site-related AEs were nonetheless common:

‘The overall incidence of AEs at the injection site was similar between the 2 groups (18% IFN β-1a versus 17% DAC HYP), as were the most common injection site AEs: injection site pain (11% versus 10%), injection site erythema (5% versus 4%), and injection site bruising (3% versus 2%).’

The incidence of ISRs appeared broadly balanced across the two groups, but the fact that the two treatments used different dosing sites and regimens means that any reaction at the site of an active injection is likely to have led to unblinding of the patient. Injection sites were covered during assessments by the rating neurologist (responsible for EDSS and relapse assessments), so this is not expected to have had a substantial effect on major efficacy endpoints.

The potential for unblinding due to flu-like symptoms (FLS) or influenza-like illness (ILI) was anticipated by the sponsor, and some attempt was made to minimise this problem. Subjects received prophylactic treatment for FLS, described in the study report as follows:

‘In order to relieve flu-like symptoms for the first 24 weeks of study treatment dosing, all subjects were instructed to take acetaminophen (paracetamol) or ibuprofen or other nonsteroidal, anti-inflammatory drugs (NSAIDs) such as naproxen or aspirin prior to each Avonex (or matching placebo) injection and for the 24 hours after each injection at the recommended dose and frequency per the local labels. Additional doses of these protocol-designated products could be taken after 24 hours post-injection within the maximum daily dose recommended per local labels. After 24 weeks, the products could be discontinued at the discretion of the investigator.’

It is unlikely that these measures were sufficient to prevent some unblinding. In usual neurological practice, many subjects receiving IFN report flu-like symptoms despite the use of prophylactic medications, and some subjects have ongoing flu-like symptoms for more than 24 hours after each injection. It seems likely that, even if the prophylactic treatment was 100% effective in masking flu-like symptoms, subjects would occasionally forget doses (or omit them on purpose to determine if they were necessary). Furthermore, the protocol allowed subjects to cease prophylactic agents at 24 weeks, at which time it is possible that subjects who were receiving IFN would be exposed to flu-like symptoms for the first time even if they had enjoyed 100% mitigation of flu-like symptoms with prophylactic agents prior to that. The fact that the protocol allowed prophylactic agents to be dropped at 24 weeks ‘at the discretion of the investigator’ highlights the fact that flu-like symptoms and the adequacy of and necessity for prophylaxis were explicitly discussed by the patient and the treating neurologist, leading to potential unblinding of the neurologist as well as the patient.

An assessment of the incidence of influenza-like illness reported as an AE shows that it was seen in a substantial proportion of IFN β-1a recipients (38%), and was much less commonly observed in DAC HYP recipients (10%). According to the sponsor:

‘Across the study period, 346 subjects in the IFN β-1a group and 88 subjects in the DAC HYP group reported at least 1 event of influenza-like illness.’

The risk that this tell-tale side effect could have led to unblinding would have been increased by the fact that the two drugs had different dosing schedules. A patient who experienced flu-like symptoms after every weekly IM injection could easily guess they were receiving IFN β-1a, and subjects who experienced similar symptoms every 4 weeks after a SC injection could guess they were receiving DAC HYP.

The evaluator found no evidence that the sponsor took steps to assess the extent of unblinding (this could have been achieved by asking subjects and neurologists to guess the treatment assignment at the end of the study).

A digital search of the clinical study report for ‘unblinding’ reveals that there were 6 instances of accidental unblinding due to logistical errors in which treatment assignments were revealed, but there is no mention of any attempt to quantify the extent of unblinding due to side effects. This is a substantial methodological flaw in the study; on the other hand, the sponsor performed sensitivity analyses in subjects with and without flu-like symptoms, and this analysis indirectly suggests that inadvertent unblinding, although likely to be present in a substantial number of patients, did not play a large role in determining the outcome.

7.1.2.5.Analysis populations

The sponsor defined three analysis populations.

The ITT population included all randomised subjects who received at least 1 dose of any study treatment, analysed according to the group to which they were randomised.

The per-protocol population was defined as all subjects from the ITT population who satisfied the following conditions:

  • Met both inclusion criteria related to MS-specific disease activity:

      • Had a confirmed diagnosis of RRMS according to McDonald criteria 1-4 and a cranial MRI demonstrating lesion(s) consistent with MS.

      • Had a baseline EDSS between 0.0 and 5.0, inclusive.

  • Compliant with study treatment: ≥ 90% of DAC HYP or Avonex doses up to Week 96.

  • Did not permanently discontinue study treatment prior to Week 96.

The safety population comprised all subjects who received at least 1 dose of any study treatment.

For the primary efficacy analysis and most secondary endpoints, the main analysis was performed on the ITT Population, and subjects were analysed in the group to which they were randomised.

For the number of new or newly enlarging T2 lesions on MRI, the analysis was based on the subset of subjects with a non-missing post-baseline assessment.

The per-protocol population was used for sensitivity analyses, and all major efficacy endpoints were reassessed in this population, generally producing results consistent with the ITT analysis.

For the safety analysis, subjects in the Safety Population were analysed according to the treatment they actually received.

Overall, the sponsor’s approach to these analysis populations was appropriate.


7.1.2.6.Statistical methods

Statistical methods in Study 205MS301 were similar to those in Study 205MS201, and they were broadly appropriate apart from a number of ‘sensitivity analyses’ that used questionable imputation methods to reanalyse the progression data, as well as a potentially misleading calculation of relative risk reduction for the proportion of subjects experiencing a relapse. These issues are discussed in more detail in the statistical methods and efficacy results of the previous study above.

The primary efficacy endpoint was the adjusted ARR based on INEC-confirmed relapses. The analysis of this endpoint included data from all ITT subjects until they completed the End of Treatment Period Visit, switched to alternative MS medication, or withdrew from the study. The difference in ARR between DAC HYP 150 mg and IFN β-1a was assessed with a negative binomial regression model, adjusting for baseline relapse rate, history of prior IFN β-1a use, baseline EDSS (≤ 2.5 versus > 2.5) and baseline age (≤ 35 versus > 35 years). This is very similar to Study 205MS201 but with one additional adjustment factor (history of IFN β-1a use).

Analysis methods for secondary endpoints included:

  • negative binomial regression (for number of T2 hyperintense lesions)

  • Cox proportional hazards and Kaplan-Meier product limit estimator (for disability progression as measured by an increase in EDSS score, and for proportions of subjects who were relapse free)

  • logistic regression (for proportion of subjects with a ≥ 7.5-point worsening in the MSIS-29 Physical Impact score).

The main analyses of efficacy endpoints excluded data after subjects switched to alternative MS medications, but the sponsor performed additional sensitivity analyses that included data after switching.

To control for multiplicity of secondary endpoints, a closed testing procedure was used. Endpoints were ranked in terms of priority, and if statistical significance was not achieved for an endpoint, all endpoints of a lower rank were considered not statistically significant. The p-values presented were nominal results, not adjusted for multiplicity.
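
The rank-ordered procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the sponsor's implementation: the function name is hypothetical, and a threshold of p ≤ 0.05 is assumed for every step.

```python
def hierarchical_test(ranked_p_values, alpha=0.05):
    """Apply a rank-ordered (fixed-sequence) testing procedure.

    ranked_p_values: nominal p-values listed in order of endpoint rank.
    Returns a list of booleans, True where the endpoint can be declared
    statistically significant under the hierarchy.
    """
    results = []
    chain_intact = True
    for p in ranked_p_values:
        # An endpoint is significant only if it and every higher-ranked
        # endpoint reached p <= alpha.
        chain_intact = chain_intact and p <= alpha
        results.append(chain_intact)
    return results


# Illustrative input: the first two values reflect the nominal results
# reported for the T2 lesion endpoint (p < 0.0001) and 12-week confirmed
# progression (p = 0.1575); the last two are hypothetical placeholders.
flags = hierarchical_test([0.0001, 0.1575, 0.001, 0.02])
```

With these inputs only the first endpoint is declared significant; once the second-ranked endpoint fails, the lower-ranked endpoints cannot be declared significant regardless of their nominal p-values.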

Analyses of tertiary endpoints were not adjusted for multiple comparisons or multiple endpoints.

7.1.2.7.Sample size

Sample size estimations were based on the primary endpoint, ARR. Power was estimated from simulations that assumed a 21% drop-out rate, an average of 2.4 years of follow-up, and an ARR of 0.27 in the IFN β-1a group. With these assumptions, it was estimated that a sample size of 900 subjects per treatment group would have approximately 90% power to detect a 24% reduction in the ARR between the IFN β-1a treatment group and the DAC HYP treatment group, based on a negative binomial regression model with a standard 5% type 1 error rate (α = 0.05). These calculations suggested that approximately 1800 subjects were required for the study and this target was exceeded (n = 1841).

Overall, these assumptions appear plausible, and the study showed itself to be adequately powered for this endpoint.

The study was not specifically powered for the key secondary endpoint of disability progression, and failed to show a significant benefit despite a slightly favourable trend.

7.1.2.8.Participant flow

Patient disposition is summarised in Figure 7 below. Completion rates were similar in both treatment groups (75% for IFN β-1a and 79% for DAC HYP) and were acceptable for a large, long study of this nature. The most common reasons for withdrawal were AEs, apparent lack of efficacy, and withdrawal of consent. The reasons were broadly balanced across the two treatment groups, making it relatively unlikely that the study experienced major withdrawal bias. There was a slight excess of IFN β-1a subjects withdrawing due to a perceived lack of efficacy (7% compared to 3% in the DAC HYP group).

Figure 7. Subject disposition and participant flow, Study 205MS301



7.1.2.9.Major protocol violations/deviations

Major protocol deviations were summarised by the sponsor as follows:

‘Overall, the incidence and category of major protocol deviations were similar between the two treatment groups. The most common major deviations (≥ 20%) were ‘Informed Consent’ (32% IFN β-1a versus 33% DAC HYP), ‘Key Study Procedures’ (28% IFN β-1a versus 27% DAC HYP), and ‘Other’ (27% (for) each group).’

This is suggestive of a high level of protocol deviations, though the description does not clearly indicate the nature of the deviations. The sponsor’s text provided a link to a table of deviations, reproduced below, but this also lacked sufficient detail and it is not possible to determine whether the deviations related to ‘Key Study Procedures’ or ‘Other’ substantially compromised the study. The sponsor should be asked to clarify this issue.

Table 11. Summary of major protocol deviations, Study 205MS301


7.1.2.10.Baseline data

The two treatment groups were well matched in terms of demographics. Although this was not demonstrated in a convenient table within the study report, the clinical overview included a one-page table covering key aspects of the demographics of both pivotal studies. (The relevant sections for Study 205MS301 are the last two columns of Table 12 below.) Both treatment groups had a similar gender distribution, mean age, and racial mix (not shown in the table).

The treatment groups were also reasonably matched for baseline disease characteristics, including mean years since diagnosis (slightly longer in the DAC HYP group), relapses in the last 3 years (≤ 2 for 57% of patients in both groups, ≥ 3 for 47% of patients in both groups), relapses in the last 12 months, mean EDSS scores (close to 2.5 in both groups), and MRI lesion counts.

Overall, the balance between treatment groups was acceptable, and the results are unlikely to have been significantly influenced by unequal risks at baseline. Furthermore, the population studied was reasonably representative of the population likely to be considered for treatment with DAC HYP.

Table 12. Demographics and baseline disease characteristics (Study 205MS201/301)



7.1.2.11.Results for the primary efficacy outcome

The study achieved a significant positive result for its primary endpoint with an adjusted ARR in the IFN β-1a group of 0.393 relapses/year, and a rate of 0.216 relapses/year in the DAC HYP group (95% CI: 0.353 to 0.438 in the IFN β-1a treatment group and 0.191 to 0.244 in the DAC HYP treatment group). Unadjusted ARRs were broadly similar (0.353 and 0.212 for IFN β-1a and DAC HYP, respectively).

These results correspond to a relative reduction of 45% in ARR (p < 0.0001) with DAC HYP, compared to IFN β-1a. The rate ratio of ARRs was 0.550 (95% CI: 0.469 to 0.645), indicating that a reduction in relapse rate of at least 35% could be expected with DAC HYP (based on the pessimistic upper limit of the 95% CI for the rate ratio) compared to an active treatment that has been shown to be superior to placebo. These are strong results for a head-to-head study and show a clear, clinically meaningful benefit with DAC HYP in reducing relapses. The expected benefit over placebo is not easily estimated from these figures, but would be expected to be better than the observed benefit over weekly IFN β-1a, and consistent with the results of Study 205MS201 (where a 54% reduction in ARR was observed for the 150 mg dose).
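
The arithmetic linking the rate ratio to the quoted percentage reductions can be checked with a short Python sketch. The function name is an assumption made for illustration; the inputs are the adjusted ARRs and the rate ratio confidence limits quoted above. Note that dividing the two adjusted ARRs only approximates the model-based rate ratio.

```python
def relative_reduction(rate_ratio):
    """Percentage reduction implied by a rate ratio (treatment / comparator)."""
    return (1.0 - rate_ratio) * 100.0


arr_ifn = 0.393  # adjusted ARR, IFN beta-1a group
arr_dac = 0.216  # adjusted ARR, DAC HYP group

# Crude ratio of the adjusted ARRs; close to the reported rate ratio of 0.550.
crude_ratio = arr_dac / arr_ifn

point_estimate = relative_reduction(0.550)    # central estimate, about 45%
least_favourable = relative_reduction(0.645)  # upper CI limit, about 35%
most_favourable = relative_reduction(0.469)   # lower CI limit, about 53%
```

The least favourable end of the interval corresponds to the statement that a reduction of at least 35% could be expected.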

Figure 8. Annualised relapse rates, Study 205MS301


Similar results were obtained in all pre-specified subgroups based on demographics and baseline disease characteristics (discussed under sub-analyses for this study).

Sensitivity analyses of this result, using different statistical approaches as shown in the figure below, produced broadly concordant results and suggest that the effect of DAC HYP on ARR was statistically and methodologically robust. In the per protocol population, the results were slightly inferior to the ITT results, but still consistent with a clear head-to-head benefit over IFN β-1a: DAC HYP reduced the ARR by 39% relative to IFN β-1a (rate ratio: 0.606 (95% CI: 0.508 to 0.724); p < 0.0001).

Figure 9. Annualised relapse rate: summary of primary and sensitivity analyses, Study 205MS301




In a related analysis, restricted to severe or serious relapses, a broadly similar proportional benefit was observed, further supporting the conclusion that the reduction in relapses was clinically meaningful: DAC HYP produced a 38% reduction in severe or serious MS relapses compared with IFN β-1a (p = 0.0021).


7.1.2.12.Results for other efficacy outcomes

The study specified four secondary efficacy endpoints, which were ranked as follows:

  1. Number of new or newly enlarging T2 hyperintense lesions on brain MRI over 96 weeks

  2. Progression of disability as measured by EDSS score

  3. Proportion of subjects free from relapse

  4. Proportion of subjects with a ≥ 7.5-point worsening from baseline in the MSIS-29 Physical Impact score at Week 96
T2 hyperintense lesions

The number of new or newly enlarging T2 hyperintense lesions at Week 96 was significantly and substantially reduced by DAC HYP, relative to IFN β-1a: the adjusted mean lesion count was 9.44 (95% CI: 8.46 to 10.54) in the IFN β-1a treatment group and 4.31 (95% CI: 3.85 to 4.81) in the DAC HYP treatment group. This amounts to a reduction of 54.4% (95% CI: 46.9% to 60.8%; p < 0.0001) with DAC HYP. Broadly similar results were obtained with a variety of sensitivity analyses (not shown in this evaluation report).
Progression of disability

Disability, as measured by the EDSS, was assessed at baseline and at all study visits throughout the treatment period. Progression was defined as a ≥ 1.0-point increase on the EDSS (or a ≥ 1.5 point increase on the EDSS from a baseline EDSS of 0) sustained for 12 weeks. The primary method of comparing treatment groups was based on a Cox proportional hazards model, adjusted for baseline EDSS (EDSS ≤ 2.5 versus EDSS > 2.5), history of prior IFN β use, and baseline age (age ≤ 35 versus age > 35 years).

By the primary prospectively specified analysis method, there was no significant difference between the groups: the hazard ratio for confirmed progression was 0.84 (DAC HYP/IFN β-1a), but the 95% CI included the possibility that progression was increased with DAC HYP (95% CI: 0.66 to 1.07).

The PI contains the following description:

‘Zinbryta treated patients had a relative risk reduction in 12 week and 24 week confirmed disability progression of 16%, (95% CI: -7% to 34%; p = 0.16) and 27% (95% CI: 2% to 45%; p = 0.03) respectively compared to interferon beta-1a (IM) treated patients.’

This statement acknowledges that the results were not uniformly significant, but it fails to acknowledge that 12 week confirmed progression was a higher ranking endpoint than 24 week confirmed progression. The sponsor’s study report was somewhat less clear, reporting that: ‘DAC HYP reduced the risk of disability progression by 16% (p = 0.1575) compared with IFN β-1a’. It is important to note that the reported reduction of 16% is merely the central estimate of an uncertain range that included a 7% increase in progression.
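The relationship between the hazard ratio, its confidence limits and the percentage figures quoted above is simple arithmetic. The following Python sketch is offered only as an illustration; it uses the hazard ratio and 95% CI reported in this section, and the function name `percent_reduction` is hypothetical.

```python
def percent_reduction(hazard_ratio: float) -> float:
    """Percentage reduction in hazard implied by a hazard ratio.

    A negative result indicates an increase in hazard.
    """
    return (1.0 - hazard_ratio) * 100.0


# Reported values for 12-week confirmed progression (DAC HYP/IFN beta-1a)
point_estimate, lower_limit, upper_limit = 0.84, 0.66, 1.07

print(round(percent_reduction(point_estimate)))  # 16: central estimate
print(round(percent_reduction(lower_limit)))     # 34: most favourable limit
print(round(percent_reduction(upper_limit)))     # -7: i.e. a 7% increase
```

The third value shows why the interval is described as including a 7% increase in progression.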

As mentioned earlier in this report, relative risk estimates derived from hazard ratios do not necessarily reflect those derived from the actual proportions reaching a hazardous endpoint. For this particular endpoint, the distinction was numerically minor. The 16% reduction in risk cited by the sponsor has presumably been derived directly from the estimated hazard ratio of 0.84; it nonetheless appears to be consistent with a direct comparison of the overall adjusted progression rates at 96 weeks. As shown in Table 13 below, the adjusted proportions of progressed patients in each group at 144 weeks were 0.203 and 0.162 for the IFN β-1a and DAC HYP groups, respectively. This means that, at the 144-week time point, the DAC HYP progression rate was 79.8% (0.162/0.203) of the IFN β-1a rate. At the 96-week time point, the progression rate was 83.9% of the IFN β-1a rate (0.120/0.143), consistent with a 16% reduction.
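The direct comparison of adjusted progression proportions described above can be reproduced with a short calculation. This is a minimal sketch using only the proportions quoted in the preceding paragraph; the variable and function names are illustrative and do not come from the study report.

```python
# Adjusted proportions of progressed patients quoted in the text:
# week -> (DAC HYP, IFN beta-1a)
adjusted_proportions = {96: (0.120, 0.143), 144: (0.162, 0.203)}


def relative_rate(dac_hyp: float, ifn: float) -> float:
    """DAC HYP progression proportion as a percentage of the IFN beta-1a proportion."""
    return dac_hyp / ifn * 100.0


for week, (dac_hyp, ifn) in adjusted_proportions.items():
    rate = relative_rate(dac_hyp, ifn)
    print(f"Week {week}: {rate:.1f}% of the IFN beta-1a rate "
          f"({100.0 - rate:.1f}% relative reduction)")
# Week 96: 83.9% of the IFN beta-1a rate (16.1% relative reduction)
# Week 144: 79.8% of the IFN beta-1a rate (20.2% relative reduction)
```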

Table 13. Time to 12-week sustained EDSS progression, Study 205MS301


The sponsor performed a number of ‘sensitivity analyses’ of these results. The first of these to be described in the study report was as follows:

‘In the primary analysis of 12-week confirmed disability progression, all subjects who had a tentative disability progression and did not have an available confirmatory assessment were assumed to be non-progressors and were censored at the time of the last assessment. A pre-specified sensitivity analysis of 12-week confirmed disability progression was performed based on the alternative assumption that confirmed disability progression would occur at a similar rate as that for subjects who completed the confirmatory assessment in the trial (after adjustment for treatment group, baseline EDSS, change in EDSS at time of tentative progression, and presence of a relapse within the 29 days prior to the tentative progression). In this analysis, DAC HYP reduced the risk of 12-week confirmed disability progression by 21% as compared with the IFN β-1a group (hazard ratio (DAC HYP/IFN β-1a) of 0.79 (95% CI: 0.62 to 1.00; p = 0.0469))’.

This approach does not appear justified. By taking the rates of confirmed progression in each group and applying them (without supporting evidence) to additional cases of unconfirmed progression, the sponsor has allowed each case of confirmed progression to be counted as more than a single case: it is first counted as a confirmed case, and then its occurrence additionally affects the assumed rate of confirmation in different cases that did not actually reach confirmation. This double-counting artificially inflates the statistical power of the analysis, and leads to a nominally significant p-value that does not reflect the true statistical uncertainty in the data.

Similar reasoning applies to another ‘sensitivity analysis’ described in the same paragraph of the clinical study report:

‘An additional pre-specified sensitivity analysis was carried out in which all tentative progressions with no confirmation assessment were assumed to be confirmed… In this analysis, DAC HYP also significantly reduced the risk of 12-week confirmed progression by 24% compared with the IFN β-1a group (hazard ratio (DAC HYP/IFN β-1a) of 0.76 (95% CI: 0.61 to 0.95; p = 0.0157)).’

In this analysis, cases of progression would clearly be contaminated by the inappropriate inclusion of relapses, because the analysis method simply assumes that unconfirmed worsening in EDSS is always sustained. This circumvents the methodological processes originally designed to assess progression without having the assessment confounded by relapses.

The sponsor provided one additional analysis purporting to show that progression was significantly reduced:

‘Additional related analyses also supported a significant treatment effect of DAC HYP in preventing disability progression compared with IFN β-1a. The risk of treatment failure (defined as the earliest of sustained progression of disability (at least a 1.0-point increase on the EDSS from a baseline EDSS ≥ 1.0 or at least a 1.5-point increase on the EDSS from a baseline EDSS of 0 that was sustained for 12 weeks), use of alternative MS medication, or treatment discontinuation due to lack of efficacy) was reduced by 19% in DAC HYP-treated subjects relative to IFN β-1a (hazard ratio (DAC HYP/IFN β-1a) of 0.81 (95% CI: 0.65 to 0.99; p = 0.0421)).’

This analysis refers to ‘treatment failure’ and the sponsor suggests that this is a potential surrogate for disease progression. In the sponsor’s quoted paragraph, though, treatment failure is defined to include, not just cases of progression, but all subjects that switched MS therapies or withdrew due to lack of efficacy.

It seems inevitable that a high proportion of subjects switching therapy or discontinuing for lack of efficacy did so because of relapses, and so this composite endpoint conflates a treatment benefit for which there is already good evidence (reduced relapses on DAC HYP) with one for which there is no solid evidence (reduced disease progression on DAC HYP).

All of these ‘sensitivity analyses’ are rejected. It must be concluded that across the full study cohort disease progression was not significantly affected by treatment allocation. Note that the term ‘sensitivity analysis’ is usually used for situations where the result is so robust it survives reanalysis with pessimistic or conservative methodology, not situations where a negative result can be rendered positive through optimistic assumptions.

Contrary to these conclusions, the sponsor argues:

‘While overall the primary and pre-specified sensitivity analyses were consistent with each other, the estimated treatment effects were stronger and reached statistical significance except when it was assumed that disability progression did not occur in any subject who was censored after a tentative disability progression (the primary analysis). This assumption of the primary analysis did not appear to be valid because the risk of confirmed disability progression was substantial after a tentative disability progression among subjects with 3-month confirmatory visits (34% in the IFN β-1a group and 37% in the DAC HYP group). Censoring after a tentative disability progression was nearly twice as common in the IFN β-1a group as in the DAC HYP group (43 subjects versus 24 subjects, respectively), reflecting a proportionally higher number of tentative disability progressions in the IFN β-1a group. While the number of subjects censored after a tentative disability progression (n = 67) was small relative to the total number of subjects with a tentative disability progression in the trial (n = 736), the assumptions made about disability progression in these censored subjects affected whether the test of statistical significance for disability progression was above or below the 0.05 significance threshold. Given this imbalance between the treatment arms and the considerable risk for disability progression expected in these subjects, the primary analysis cannot be assumed to have provided an unbiased estimate of the treatment effect. Overall, based on the pattern of censoring and the high risk of confirmed disability progression after a tentative disability progression, the sensitivity analyses are considered most likely to have provided the most accurate estimate of the treatment effect on 12-week confirmed disability progression in this study.’

These comments indicate that only about one third of tentative progressions, if followed up, converted to confirmed progressions. They also indicate that censoring after a tentative progression was nearly twice as common in the IFN β-1a group (possibly in part because relapses were twice as common in that group), so this group is the one that would end up with more cases of ‘assumed progression’ by any of the imputation methods proposed. The Evaluator agrees that it cannot be assumed the primary analysis was unbiased, but it cannot be assumed that the suggested alternative analyses were unbiased, either – and in the case of imputing 100% conversion rates from ‘tentative’ to ‘confirmed’, the method clearly conflates relapses and progression. A bigger problem is that, even if the suggested imputation methods were unbiased, they artificially inflate the power of the analyses, and even then only just reach nominally significant p-values. If appropriate adjustments were made for the Sponsor’s use of multiple analysis methods, it is likely that these additional analyses would no longer achieve even nominal significance.

In conclusion, this study did not demonstrate that DAC HYP prevents progression in comparison to IFN β-1a.

Figure 10. Time to sustained EDSS progression, Study 205MS301


When an additional analysis was performed using 24-week sustained progression (prespecified as a tertiary endpoint), the difference between the groups emerged as statistically significant: relative to IFN β-1a, DAC HYP was associated with a reduced risk of 24 week confirmed disability progression, expressed as a hazard ratio of 0.73 (95% CI: 0.55 to 0.98; p = 0.0332). These results have not been adjusted for multiplicity and cannot be considered statistically robust.


Proportion of subjects free from relapse

The primary analysis of this secondary endpoint was based on INEC-confirmed relapses in the ITT population occurring between the first dosing date and the subject’s End of Treatment Visit or time of censoring. A total of 392 subjects (43%) in the IFN β-1a group and 260 subjects (28%) in the DAC HYP group had an INEC-confirmed relapse. The Kaplan-Meier estimate was used to assess the likely proportions of relapse-free subjects at different time points in the IFN β-1a and DAC HYP groups, respectively:

71.2% and 81.2% at 48 weeks;

58.5% and 72.9% at 96 weeks;

50.8% and 67.3% at 144 weeks.

The hazard ratio (DAC HYP/IFN β-1a) for the risk of relapse was 0.59 (95% CI: 0.50 to 0.69; p < 0.0001), which is a strong result for a head-to-head study, and consistent with the observed reduction in ARR.

Based on the proportions relapsing at 144 weeks (49.2% and 32.7% in the IFN β-1a and DAC HYP groups respectively), the relative proportion of relapsing subjects for DAC HYP was 66.5% (32.7/49.2) of the proportion observed with IFN β-1a. That is, the relative reduction in proportion relapsed was 33.5%. The sponsor’s clinical study report claimed that these results demonstrated that the risk of relapse was reduced by 41% in the DAC HYP group, compared to IFN β-1a, but this inflated estimate presumably refers to the reduction in instantaneous hazard (hazard ratio = 0.59), as already discussed in the context of Study 205MS201. Patients and clinicians are more likely to be interested in the risk reduction over a defined time period, which can be estimated as 33.5% for a period of 144 weeks. A similar calculation suggests that the reduction in the proportion relapsing at 48 weeks was approximately 34.7% (the proportions relapsing were 28.8% and 18.8% in the IFN β-1a and DAC HYP groups, respectively, and 18.8/28.8 is 65.3%). Overall, by these calculations, the reduction in the proportion of subjects relapsing was about 34-35% with DAC HYP, compared to IFN β-1a, rather than the 41% suggested in the clinical study report. The PI also includes the inflated estimate of 41%, and it should be changed to reflect the actual reduction in risk over the course of the study.
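The calculation described above can be repeated for each reported time point. The sketch below is illustrative only: it takes the Kaplan-Meier relapse-free estimates listed earlier in this section, converts them to proportions relapsing, and compares the resulting relative reductions with the figure implied by the hazard ratio. The 96-week result is not stated in the text and is derived here from the reported estimates; the function name `relative_reduction` is hypothetical.

```python
# Kaplan-Meier estimates of relapse-free subjects (%), as reported above:
# week -> (IFN beta-1a, DAC HYP)
relapse_free = {48: (71.2, 81.2), 96: (58.5, 72.9), 144: (50.8, 67.3)}


def relative_reduction(ifn_free: float, dac_free: float) -> float:
    """Relative reduction (%) in the proportion relapsing, DAC HYP versus IFN beta-1a."""
    ifn_relapsed = 100.0 - ifn_free
    dac_relapsed = 100.0 - dac_free
    return (1.0 - dac_relapsed / ifn_relapsed) * 100.0


for week, (ifn_free, dac_free) in relapse_free.items():
    reduction = relative_reduction(ifn_free, dac_free)
    print(f"Week {week}: {reduction:.1f}% relative reduction in proportion relapsing")

# For comparison, the reduction in instantaneous hazard implied by HR = 0.59
print(f"Hazard-based reduction: {(1.0 - 0.59) * 100.0:.0f}%")
```

At all three time points the relative reduction in the proportion relapsing falls between roughly 33% and 35%, below the 41% obtained from the hazard ratio.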

Similar results for reductions in proportions relapsing were obtained with a number of sensitivity analyses (not shown in this evaluation).

Figure 11. Time to first relapse, Study 205MS301



Proportion of subjects with a ≥ 7.5-point worsening from baseline in the MSIS-29 physical impact score at Week 96

The MSIS-29 Physical Impact score was assessed with a logistic regression model that included adjustments for the baseline Physical Impact score, baseline BDI, history of prior IFN β use, and baseline age (age ≤ 35 versus age > 35 years). Data imputation was required at Week 96 for a substantial number of patients: 202 subjects in the IFN β-1a group and 169 subjects in the DAC HYP group.

At the pre-specified main time point of 96 weeks, 213 subjects (23%) in the IFN β-1a group had a ≥ 7.5-point worsening from baseline, compared with 171 subjects (19%) in the DAC HYP group. The difference was statistically significant, with an odds ratio (DAC HYP/IFN β-1a) of 0.76 (95% CI: 0.60 to 0.95; p = 0.0176).

A direct comparison of the proportions showing worsening (19% versus 23%) produces a ratio of 83% (19/23 = 0.826), or a relative improvement of approximately 17% for this endpoint. The absolute difference of 4% appears to be of rather modest clinical benefit.
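The difference between an odds ratio and a direct ratio of proportions can be illustrated with the rounded percentages quoted above. This is a rough sketch only: it uses the rounded values of 23% and 19%, applies no covariate adjustment, and therefore would not be expected to reproduce the model-based odds ratio of 0.76 exactly. The function name `odds` is illustrative.

```python
def odds(p: float) -> float:
    """Convert a proportion to odds."""
    return p / (1.0 - p)


# Rounded proportions with a >= 7.5-point worsening at Week 96 (from the text)
p_ifn, p_dac = 0.23, 0.19

risk_ratio = p_dac / p_ifn                    # direct ratio of proportions
crude_odds_ratio = odds(p_dac) / odds(p_ifn)  # unadjusted, from rounded inputs
absolute_difference = p_ifn - p_dac

print(f"Ratio of proportions: {risk_ratio:.3f}")
print(f"Crude odds ratio:     {crude_odds_ratio:.2f}")
print(f"Absolute difference:  {absolute_difference:.0%}")
```

Because an odds ratio lies further from 1 than the corresponding ratio of proportions whenever the outcome is not rare, quoting the odds ratio alone tends to make the benefit look larger than the direct comparison suggests.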

Sensitivity analyses of this endpoint, including assessments in the PP Population, produced broadly similar results.

Because IFN β-1a is associated with a number of side effects including fatigue, flu-like malaise, spasm and depression, improvements in quality of life relative to interferon may partly reflect adverse effects of interferon rather than efficacy benefits of DAC HYP. The comparison with placebo, in the previous pivotal study, did not achieve significance.

Subgroup analyses

The sponsor performed subgroup analyses for the ARR, based on a number of different baseline prognostic factors including number of relapses in different time frames, EDSS scores, MRI characteristics, time since diagnosis and previous treatment. These are summarised in the forest plot below (see Figure 12). In all but one subgroup, a significant treatment effect was observed, despite the fact that the subgroups had fewer subjects and the analysis had less power than the original analysis with the full cohort. The exception was the subgroup of patients with EDSS ≥ 3.5 at baseline. In this group, there was a trend to benefit with DAC HYP, but the hazard ratio estimate was less favourable than for other subgroups and the 95% CI included unity. This suggests that subjects with more advanced disease (who are more likely to have reached a SPMS stage of the illness rather than having pure RRMS) may be less responsive to treatment.

Figure 12. Annualised relapse rate by demographic subgroups, Study 205MS301




Figure 12. Annualised relapse rate by demographic subgroups, Study 205MS301 (continued)


Figure 12. Annualised relapse rate by demographic subgroups, Study 205MS301 (continued)


The sponsor also performed subgroup analyses for the main MRI endpoint, new or enlarging T2 lesions, and these consistently favoured DAC HYP, as shown in Figure 13 below. All subgroups in the figure show a significant benefit for DAC HYP.

Figure 13. New or newly-enlarging T2 lesions by baseline disease characteristics


Figure 13. New or newly-enlarging T2 lesions by baseline disease characteristics (continued)


Subgroup analyses for the endpoint of sustained disability progression did not identify any subgroup in which the effects of DAC HYP and IFN β-1a were significantly different. This reflects the primary analysis of this endpoint in the full cohort, where a favourable trend was identified but no significant difference was observed.

Figure 14. Sustained disability progression (measured by increase in EDSS) by baseline disease characteristics


Figure 14. Sustained disability progression (measured by increase in EDSS) by baseline disease characteristics (continued)


Subgroup analysis for high disease activity versus low disease activity

A new subgroup analysis based on high disease activity versus low disease activity was suggested by the EMA, and is discussed in the following section concerning data supplied following EMA questions. Prior to receiving this request, the sponsor had already conducted their own subgroup analysis of high disease activity and low disease activity subgroups, and these results are included in the figures above. Although a similar analysis in Study 205MS201 had been post hoc, the definition of these subgroups in Study 205MS301 was prospective, and subjects were categorized as having low or high disease activity at baseline. High disease activity was defined as ≥ 2 relapses in the year prior to randomisation and ≥ 1 Gd lesion on the baseline MRI.

By this definition, there was a similar proportion of low disease activity subjects in both treatment groups, as shown in the figures above (IFN β-1a, 713/917, 77.8%; DAC HYP 723/907, 79.7%). The superiority over IFN β-1a was statistically significant in both subgroups (see Figure 7, above). There was a trend to DAC HYP producing a better reduction in ARR in the high-disease-activity subgroup than in the low-disease-activity subgroup.

For the secondary endpoint of new T2 lesions, a similar benefit was observed in both subgroups, and each subgroup achieved a statistically significant result showing superiority of DAC HYP relative to IFN β-1a.

For disease progression, a significant benefit with DAC HYP was not demonstrated for the cohort as a whole, or for any subgroup, including subgroups defined on the basis of disease activity.


