The characteristics of the 881 subjects enrolled in the various centres and of the 7017 considered cycles, with their outcomes, are summarised in Tables 2, 3 and 4. The number of subjects and contributed cycles varied markedly between centres and consequently, in order to obtain meaningful fecundability patterns from the analysis, some aggregation of data was made. In most analyses the data from Auckland were kept separate from those of the European centres owing to their specific features mentioned in [2.1] having an impact on the level of fecundability.
The average age of women in the study population was close to 29 years and was relatively similar at each centre (Table 2). The proportions of women of proven fertility and of those with past use of hormonal contraception are, however, very different among the centres. For the European centres overall, the percentage of women with at least one previous pregnancy was only 44.6% (range for centres: 30.8 - 73.1) while only 30.1% (range for centres: 11.4 - 56.2) had ever used hormonal contraception in the past (Table 2).
For these same centres, Table 3 underlines the high frequency of cases (96.4%) in which, when enough information was available, the described procedure allowed the BBT shift to be determined. However, when at least some information on temperature was recorded, in a further 6.1% of the cycles the reference day could not be identified owing to missing information on critical days, and in 1.6% owing to disturbing illness. The proportion of cycles in which the mucus reference day was determined, under similar conditions, is a little lower (94.1%), owing to the particularly low percentage in the Paris subgroup. At that centre, in local usage, mucus symptoms are taken into consideration mainly for identification of the beginning of the "fertile" phase. The 575 detected pregnancies listed according to centres in Table 3 include both those continuing at 60 days from the onset of the last menses and the 49 clinically recognised miscarriages of the same period (also listed).
The figures of Table 4 (5591 cycles with BBT reference day in Table 4a and 5928 with mucus reference day in Table 4b) are linked with a conventional determination of the post-ovulatory phases starting after the respective reference days. They give an impression of a remarkable homogeneity between centres. The length of the phase after the peak mucus day in the various centres parallels similar results obtained in the WHO [World Health Organization 1983] study on the ovulation method. As expected, the length of the preovulatory phase shows a relative variability higher than that of the postovulatory one: e.g., for the European aggregate the coefficient of variation (4.74/16.7) is 25.7% in the first vs. 16.2% in the second.
It has to be noted that the two samples - with information on BBT and/or mucus - coincide in a sizeable proportion of cycles (5390 in the combined European group, 232 in Auckland: in the two sets of data both surrogate markers of ovulation were determined in about 80% of the cycles). On average, the peak mucus symptom occurred 0.31 days (S.d. 1.82) before the last low temperature day in the European group (0.30 with S.d. 1.83 when the Auckland data were included).
The database can also be used in various forms to study the behaviour of the subjects. Table 5, showing the decline in the frequency of intercourse with the increasing age of each of the partners, provides an example. Three points have to be considered: the number of men above 40 is rather small; in conception cycles only acts of intercourse up to the 29th day of the cycle were counted; for obvious reasons, the data are for European centres only. The trend with age, evaluated through the arithmetic average (preferred to the median for the sake of better evidence), and the higher coefficient of variation in non-conception cycles (61.3% vs. 49.7%), both support the reliability of the data collected. The small variations between the male and the female findings reflect differences in the number of subjects in the various classes and on the whole. For female partners, over all age groups, the median number of recorded acts of intercourse (10th, 90th percentiles) is 6 (3, 11) in the conception cycles and 4 (1, 8) in the non-conception cycles.
Table 6 lists the distribution of 5390 cycles according to the interval in days between the two markers of ovulation (BBT reference day minus mucus reference day). The average distance between those days is already known from [3.1]. There is some translation between the two reference terms which, though small, can influence the comparative distributions of cycles, and of intercourse episodes and pregnancies allocated to the various days of the respective fecundability window. In the majority (62.4%) of the cycles the two markers are within ± one day, and the difference is greater than ± two days in 17% of the cycles. This suggests that estimates of day-specific pregnancy probabilities should not depend greatly on which marker is used for ovulation. However, we cannot rule out possible overestimation of the fertile interval relative to the BBT or mucus reference day compared with the width of the fertile interval relative to the true day of ovulation. Although efforts were made to rule out errors in documentation of BBT or cervical mucus, measurement errors can arise from unavoidable biological variability. In future work, such measurement errors could be assessed and corrected using recently developed statistical methodology [Dunson and Weinberg 2000, Dunson et al in press].
3.2 Fertility Windows: Direct Estimates of Fecundability
In order to find windows of fertility, around the BBT or the mucus reference day, to be used for estimates of daily fecundability, an exploratory analysis was made, changing the width and location of the chosen windows. For each reference marker, it was found that, when no intercourse episodes were ascertained in a 12-day window, no pregnancy was recorded. Eight of the 12 days preceded day 0 and three came afterwards.
Then, direct estimates of daily fecundability were computed inside these windows. In this initial determination, only cycles with a single act of intercourse in a window were selected. The ratio of instances in which the acts of one day resulted in conception to the total number of acts of intercourse of the same day gave, for that day, an estimate of the probability of conception. The results are presented in Table 7 for the combined European centres (top section) and with inclusion of Auckland for all centres (bottom section). The differences in the number of cycles between the bottom and the top grouping give the contribution from Auckland. The two sets of probabilities are very different, particularly when the impact of the Auckland data, in terms of number of conception cycles, is relevant: direct estimates obtained for this site are on average about double those of the European ones. It is worth mentioning that none of the almost 350 intercourse episodes on the third day of the high BBT gave rise to a conception. Also, Auckland conforms to the other centres concerning the width of the window, which might be shorter, even when due account is taken of the smaller sample size.
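The direct estimate described above can be illustrated with a minimal sketch. It assumes that each cycle is recorded as a pair (days with intercourse relative to day 0, conception yes/no), and it takes the 12-day window as days -8 to +3, as implied by the description in the text. The function name and the data layout are hypothetical and are not taken from the study database.

```python
from collections import Counter

# 12-day window: eight days before day 0, day 0 itself, three days after.
WINDOW = range(-8, 4)

def direct_estimates(cycles):
    """Day-specific conception ratios from single-intercourse cycles.

    cycles: iterable of (intercourse_days, conceived) pairs, where
    intercourse_days lists days relative to the reference day (day 0).
    This layout is a hypothetical one chosen for illustration.
    """
    acts = Counter()
    conceptions = Counter()
    for days, conceived in cycles:
        in_window = [d for d in set(days) if d in WINDOW]
        if len(in_window) != 1:
            continue  # keep only cycles with a single act in the window
        day = in_window[0]
        acts[day] += 1
        if conceived:
            conceptions[day] += 1
    # ratio of conceptions to acts of intercourse, day by day
    return {day: conceptions[day] / acts[day] for day in sorted(acts)}
```

With toy data such as `[([-2], True), ([-2], False), ([1], False)]` the function returns a ratio of 0.5 for day -2 and 0.0 for day +1; cycles with several acts in the window do not contribute.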
A similar exercise was performed, with data from the European centres only, with the aim of obtaining more precise fecundability estimates by increasing the number of contributing cycles through use of a smaller window, in which the probability of having a single intercourse episode is increased. However, cycles were eliminated from consideration in which, although only a single act of intercourse occurred in the shorter window, conception might have been due (though certainly with a small probability) not to that coital act but to intercourse episodes falling outside the window. For this purpose, the days considered relevant were the three days -9, -8 and -7 for cycles having intercourse on day -6, the two days -8 and -7 for cycles with intercourse on day -5, and the single day -7 for cycles with intercourse on day -4. Similarly, cycles with intercourse on day +2 were excluded from the analysis. The analysis was extended to a parallel window around the mucus reference day. The results for both analyses are shown in Table 8. In absolute terms, the main differences between the two sets of probabilities are observed on days -3 and 0. Considering that, besides random errors and the small shift in BBT versus mucus, the two aggregates of cycles are different, the estimates of fecundability, daily and total, appear to be in good agreement. It is worth noting that the peak mucus day is not the day with maximum fecundability. In each aggregate, the four days preceding the reference day are the most relevant for cycle fecundability.
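The exclusion rule just described can be sketched as follows. The limits of the shorter window are not stated explicitly in the text; the sketch assumes days -6 to +1, which is consistent with the days listed as relevant outside it. The names `SHORT_WINDOW`, `RELEVANT_OUTSIDE` and `eligible_day` are hypothetical.

```python
# Assumption: the shorter window runs from day -6 to day +1.
SHORT_WINDOW = range(-6, 2)

# Days outside the window that lead to exclusion, by day of the single act.
RELEVANT_OUTSIDE = {-6: (-9, -8, -7), -5: (-8, -7), -4: (-7,)}

def eligible_day(intercourse_days):
    """Return the day of the single in-window act, or None if excluded."""
    days = set(intercourse_days)
    in_window = [d for d in days if d in SHORT_WINDOW]
    if len(in_window) != 1:
        return None  # zero or several acts in the shorter window
    act = in_window[0]
    if 2 in days:
        return None  # cycles with intercourse on day +2 are excluded
    if any(d in days for d in RELEVANT_OUTSIDE.get(act, ())):
        return None  # a relevant act just before the window
    return act
```

Under these assumptions a cycle with acts on days -9 and -4 is kept (day -9 is not relevant for an act on day -4), whereas a cycle with acts on days -7 and -4 is dropped.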
3.3 Estimates through a Model
In the presence of multiple acts of intercourse during the fertile interval of a cycle, the probability of conception due to a single act on any day cannot be estimated directly. One has to make use of a model whose computed coefficients may lead to an evaluation of daily fecundability. For this purpose, in the following, estimates of day-by-day conception probabilities are obtained through the application of the Schwartz model [Schwartz, MacDonald, and Heuchel 1980], summarised in [2.5.1]. This model has been repeatedly used in the literature, and therefore allows comparisons with other studies.
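As an illustration, and assuming the form in which the Schwartz model is usually written (section [2.5.1] is not reproduced here), the probability of conception in a cycle is k[1 - Π(1 - p_i)], the product running over the days i of the window on which intercourse occurred, where k is the probability that the cycle is viable and p_i the probability that intercourse on day i alone leads to conception in a viable cycle. The function name and the numerical values below are hypothetical.

```python
import math

def schwartz_cycle_probability(k, p, intercourse_days):
    """Probability of conception in one cycle under the assumed model.

    k: probability that the cycle is viable.
    p: dict mapping window day -> daily probability p_i.
    intercourse_days: days (relative to day 0) with intercourse;
        days outside the window (not in p) contribute nothing.
    """
    no_conception = math.prod(
        1.0 - p[d] for d in set(intercourse_days) if d in p
    )
    return k * (1.0 - no_conception)

# Hypothetical values, for illustration only.
example = schwartz_cycle_probability(0.5, {-2: 0.3, -1: 0.2}, [-2, -1])
```

With the hypothetical values shown, the result is 0.5 × (1 - 0.7 × 0.8) = 0.22. In an actual analysis k and the p_i would be estimated by maximum likelihood from all cycles in the window.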
The model estimates of daily fecundability for the European subjects are presented in Table 9, with confidence intervals obtained through the profile likelihood [Clayton and Hills 1993], at the 90% level. The chosen windows are those already seen. The two sets of data have a different composition, but once again they underline in both cases the significance of higher rates in the four days preceding the respective reference day.
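The principle of a likelihood-based interval can be shown in the simplest case, a single binomial proportion, where there are no nuisance parameters to be maximised out. This is only a one-parameter analogue of the profile likelihood intervals of Table 9, not the computation used for that table. The 90% interval collects the values of p whose deviance from the maximum does not exceed 2.7055, the 0.90 quantile of the chi-square distribution with one degree of freedom. All names are hypothetical.

```python
import math

CHI2_90_1DF = 2.705543  # 0.90 quantile of chi-square, 1 df (1.644854 ** 2)

def loglik(p, x, n):
    """Binomial log-likelihood, up to an additive constant."""
    return x * math.log(p) + (n - x) * math.log(1.0 - p)

def deviance(p, x, n):
    """Twice the drop in log-likelihood from its maximum at x / n."""
    return 2.0 * (loglik(x / n, x, n) - loglik(p, x, n))

def _crossing(lo, hi, x, n, iters=200):
    """Bisection for the point where the deviance equals the cutoff."""
    f_lo = deviance(lo, x, n) - CHI2_90_1DF
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = deviance(mid, x, n) - CHI2_90_1DF
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def likelihood_interval(x, n):
    """90% likelihood-based interval; requires 0 < x < n."""
    p_hat = x / n
    eps = 1e-12
    return _crossing(eps, p_hat, x, n), _crossing(p_hat, 1.0 - eps, x, n)
```

For example, `likelihood_interval(10, 100)` gives an interval that contains 0.10 and is not symmetric about it, which is the typical behaviour of likelihood-based limits for small proportions.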
In Figure 1, the daily estimates relative to each of the two markers of ovulation are presented. These estimates are based on the 5390 cycles from the European centres for which both reference days are available. There is a total of only 386 pregnancies, since for 48 there is information only on the peak mucus day, for 49 only on BBT shift, and nothing in 4 instances. The given confidence intervals are at the 90% level. Several points may be mentioned: a) in the two sets of estimates, though the total number of cycles is the same, the number of those with at least one intercourse episode in the window differs: 2917 for BBT and 2843 for mucus, respectively. This difference will have an effect, though small, on the respective areas under the curve; b) one has to remember the mentioned average distance between the two reference days and its possible effects (see para 7 of [3.1]); c) the estimates based on the mucus symptom conform less well to the bell-shaped pattern observed with the BBT window; d) the dip at day -3 found through the mucus symptom repeats what is seen in the data set of Table 9 and also in the direct estimates of Table 8: a point deserving further elaboration.
It appears that the BBT reference day may be a slightly better (i.e. less error prone) marker of ovulation day, since the estimates, compared with those around the mucus reference day, are higher on the days of peak fertility (i.e. days -3 to -1) and lower on the days towards the edge of the window.
In Table 10 the results for the 12-day BBT window are compared with fecundability estimates reported from five other similar studies. A few notes will clarify the limits of these comparisons. The discrepancies between the different sets of probabilities can be attributed, apart from random errors, to different characteristics of the subjects, to distinct procedures followed in determining the ovulation reference day and to the inclusion or exclusion of early miscarriage in the counted pregnancies. The probabilities reported by Schwartz et al. [Schwartz et al 1979] are direct estimates from a single artificial insemination by donor per cycle. The data by Weinberg et al. [Weinberg et al 1998] and by Wilcox et al. [Wilcox, Weinberg, and Baird 1998] come from recruitment from the general population of subjects wanting to achieve a pregnancy. In the other two studies, the information was collected in centres providing services on fertility awareness and natural fertility regulation. Weinberg et al. [Weinberg et al 1998] were able to include, through assay of hCG, very early pregnancy losses, otherwise undetected by clinical diagnosis. In the same set of pregnancies, Wilcox et al. [Wilcox, Weinberg, and Baird 1998] considered only those clinically diagnosed, that is, events more similar to those considered in the present aggregate of European centres. In the other studies there were no important differences in the recording of pregnancies. In conception cycles with multiple acts of intercourse in the "fertile" window, Bremme [Bremme 1991] chose to assign pregnancy to the intercourse which occurred closest in time prior to or coinciding with the presumed day of "ovulation": a procedure leading to a bias which increased fecundability rates as the "ovulation" day was approached. For the probabilities computed in Weinberg et al. [Weinberg et al 1998] and in Wilcox et al. [Wilcox, Weinberg, and Baird 1998], ovulation day (i.e. day 0) was identified using the decline in the ratio of oestrogen to progesterone metabolites in the urine that accompanies luteinization of the ovarian follicle [Baird et al 1991]. This steroid-based marker should be less error-prone than markers based on BBT or mucus, but should not deviate systematically from the last day of low temperature used in the other studies, as in the present data base. Apart from Bremme and Schwartz et al. [Schwartz et al 1979], the other four sets of estimates were based on the Schwartz model [2.5.1].
Figure 2 shows a graphical comparison of the pattern of conception probabilities in the BBT window for four subgroups (centres or combinations of centres) and for the whole European experience. The results for the Auckland subjects clearly differ from those of the other subgroups. The other three subgroups consisted of the Verona centre, Milan aggregated with Lugano because of similarity of NFP teaching content and method, and the four remaining European centres combined because of their small sample sizes. The homogeneity of the fecundability data between the three European subsets is striking. The maximum likelihood ratio test of significance of the differences between the three European subsets gives p>0.10. The merging of their records into a single European group appears reasonable: this will form the basis of all subsequent analyses on the level of fecundability.
Figures 3, 4 and 5 focus on the link between three covariates pertaining to the female subjects and fecundability in the window around the BBT reference day. The covariates evaluated are: the reproductive history of the woman, by comparing subjects with and without a previous pregnancy (Figure 3); the woman's age, by dividing the subjects into three age groups, 18-24 yrs (103 subjects), 25-34 yrs (596) and 35-39 yrs (83) (Figure 4); and past use or non-use of oral contraception (Figure 5). The difference in the level of fecundability of the women of proven fertility versus the unproven group is statistically significant (p = 0.014). In the group with unproven fertility, though the subjects obviously believed they were fertile, their number would include some with undiagnosed infertility or sub-fertility, as in the general population. Furthermore, in at least one Italian centre, subjects may have been included in the study who were seeking help in achieving a pregnancy after a prolonged experience of failure. No marked differences in fecundability rates were observed in the three age groups (p>0.10), though the sample sizes in the younger and older groups are relatively small. When the subjects were divided into those below and those above the median age (29 years), again no significant difference in fecundability was found between the two groups (p>0.10, data not shown). Similarly, no significant differences (p>0.10) are seen in the daily fecundability when comparisons are made between past use and no previous use of oral contraception. It should be noted, however, that the number of women having used this method of contraception in the three cycles preceding their entry into the study is extremely low (3.0%).
Two further results pertaining to the cycles are presented in Figures 6 and 7. Figure 6 is based on the data of Table 6. The whole set of cycles is divided into three groups according to the time difference between the BBT reference day and the peak mucus day: group 1, negative difference (1569 cycles, 29.1% of the total); group 2, difference equal to 0 or 1 day (2553, 47.4%); group 3, difference greater than 1 day (1268, 23.5%). For each of the three derived sub-sets the Figure shows the pattern of estimated daily conception probabilities. Attention is drawn to the sub-set in which the two reference points (almost) coincide, and therefore should support each other as giving a rather good approximate indication of the time of ovulation. The pattern of conception probabilities appears very concentrated, falling after a continuous rise extending over five days, with a maximum at day -2, and approaching zero at both extremes (see also Wilcox et al. [Wilcox, Weinberg, and Baird 1998]). The pattern is somewhat similar in group 3, though more elevated at the beginning of the ascending part and then falling abruptly on day zero, remaining thereafter at this level. When the peak mucus day occurs after the BBT reference day (group 1) the probability pattern is very irregular, with two maxima (on day -3 and day 0). The difference between the three sets of probabilities is statistically significant (p=0.020).
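The grouping used for Figure 6 can be written down in a few lines. The sketch assumes that both reference days are given as cycle-day numbers for the same cycle; the function names are hypothetical, while the three counts are those reported above.

```python
def marker_group(bbt_day, mucus_day):
    """Group a cycle by the difference BBT reference day minus peak mucus day."""
    diff = bbt_day - mucus_day
    if diff < 0:
        return 1  # peak mucus day after the BBT reference day
    if diff <= 1:
        return 2  # the two markers (almost) coincide
    return 3      # BBT reference day more than one day after peak mucus

def group_shares(counts):
    """Percentage of cycles in each group, rounded to one decimal."""
    total = sum(counts.values())
    return {g: round(100.0 * n / total, 1) for g, n in counts.items()}

# Counts reported in the text for the 5390 European cycles.
REPORTED_COUNTS = {1: 1569, 2: 2553, 3: 1268}
```

Applied to the reported counts, `group_shares` reproduces the percentages quoted in the text (29.1, 47.4 and 23.5), and the three counts add up to the 5390 cycles of Table 6.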
Figure 7 illustrates the pattern of daily fecundability for two different subsets of cycles, one with the window around the BBT shift (3175 cycles with at least one intercourse in the window, 434 pregnancies) and the other with the window around the mucus reference day (3265 cycles, 435 pregnancies). The two subsets are each further divided according to the length of the conventional follicular phase of the cycles, <16 days and ≥16 days. The difference in shape between the two derived patterns of fecundability is highly significant (p=0.003 for BBT, p<0.001 for mucus). The differences in probability levels on, say, day -4 depending on the said length are very strong. Evidently the distance -4 does not have the same meaning for all cycles; the same holds for day zero, though with an inverse relationship between the probabilities of the two subsets. The evidence is the same for both BBT and mucus, which tends to exclude systematic errors in the identification of the reference days as an explanation. Is there a biological foundation for such a result, or does it serve as a hint that the positioning of ovulation in the cycle should be considered more stable, and that of the conventional surrogate indicators more variable?
Daily Fecundability: First Results from a New Data Base
Bernardo Colombo, Guido Masarotto
© 2000 Max-Planck-Gesellschaft ISSN 1435-9871